This review was developed to introduce the essential components and variants of structural equation modeling (SEM), synthesize the common issues in SEM applications, and share our views on SEM’s future in ecological research.
We searched the Web of Science on SEM applications in ecological studies from 1999 through 2016 and summarized the potential of SEMs, with a special focus on unexplored uses in ecology. We also analyzed and discussed the common issues with SEM applications in previous publications and presented our view for its future applications.
We found 146 relevant publications on SEM applications in ecological studies. Five SEM variants had not commonly been applied in ecology: the latent growth curve model, Bayesian SEM, partial least square SEM, hierarchical SEM, and variable/model selection. We identified ten common issues in SEM applications: strength of causal assumption, specification of feedback loops, selection of models and variables, identification of models, methods of estimation, explanation of latent variables, selection of fit indices, report of results, estimation of sample size, and the fit of model.
In previous ecological studies, measurements of latent variables, explanations of model parameters, and reports of key statistics were commonly overlooked, while several advanced uses of SEM had been ignored overall. With the increasing availability of data, the use of SEM holds immense potential for ecologists in the future.
Structural equation modeling (SEM) is a powerful, multivariate technique found increasingly in scientific investigations to test and evaluate multivariate causal relationships. SEMs differ from other modeling approaches as they test the direct and indirect effects on pre-assumed causal relationships. SEM is a nearly 100-year-old statistical method that has progressed over three generations. The first generation of SEMs developed the logic of causal modeling using path analysis (Wright 1918, 1920, 1921). SEM was then morphed by the social sciences to include factor analysis. By its second generation, SEM expanded its capacity. The third generation of SEM began in 2000 with Judea Pearl’s development of the “structural causal model,” followed by Lee’s (2007) integration of Bayesian modeling (also see Pearl 2003).
Ecologists have enlisted SEM over the past 16 years to test various hypotheses with multiple variables. SEM can analyze the complex networks of causal relationships in ecosystems (Shipley 2002; Grace 2006). Chang (1981) and Maddox and Antonovics (1983) were among the first ecologists who employed SEM in ecological research, clarifying the logical and methodological relationships between correlation and causation. Grace (2006) provided the first comprehensive book on SEM basics with key examples from a series of ecosystem studies. Now, in the most recent decade, a rapid increase of SEM in ecological sciences has been witnessed (Eisenhauer et al. 2015).
SEM is a combination of two statistical methods: confirmatory factor analysis and path analysis. Confirmatory factor analysis, which originated in psychometrics, aims to estimate latent psychological traits, such as attitude and satisfaction (Galton 1888; Pearson and Lee 1903; Spearman 1904). Path analysis, on the other hand, had its beginning in biometrics and aimed to find the causal relationship among variables by creating a path diagram (Wright 1918, 1920, 1921). The path analysis in earlier econometrics was presented with simultaneous equations (Haavelmo 1943). In the early 1970s, SEM combined the two aforementioned methods (Joreskog 1969, 1970, 1978; Joreskog and Goldberger 1975) and became popular in many fields, such as social science, business, medical and health science, and natural science.
This review is an update on Grace et al. (2010) and Eisenhauer et al. (2015), who both provided a timely and comprehensive review of SEM applications in ecological studies. This review differs from those two in that it covers general ecological papers that applied SEM from 1999 through 2016, whereas Eisenhauer et al. (2015) focused only on SEM applications in soil ecology before 2012. In this review, we included basic SEM applications, because SEM remains unknown to many ecologists, and summarized the potential applications for SEM models that are often overlooked, including the issues and challenges in applying SEM. We developed our review around three critical questions: (1) is the use of SEM in ecological research statistically sound; (2) what are the common issues facing SEM applications; and (3) what is the future of SEM in ecological studies?
Path analysis was developed to quantify the relationships among multiple variables (Wright 1918, 1920, 1921). It was the early name for SEM before there were latent variables, and was very powerful in testing and developing the structural hypothesis with both indirect and direct causal effects. However, the two effects have recently been synonymized. Path analysis can explain the causal relationships among variables. A common function of path analysis is mediation, which assumes that a variable can influence an outcome directly and indirectly through another variable. For example, light intensity (PAR), air temperature (Ta), and aboveground temperature (Ts) can influence net ecosystem exchange (NEE) indirectly through respiration (Re); yet PAR and Ts can influence Re directly (Fig. 1, Shao et al. 2016). Santibáñez-Andrade et al. (2015) applied mediation to evaluate the direct and indirect causes of degradation in the forests of the Magdalena river basin adjacent to Mexico City. The study sought to integrate abiotic controls and disturbance pressure with ecosystem conservation indicators to develop strategies for preserving biodiversity. In another SEM study, a 23-year field experiment on a plant community in an Alaskan floodplain found that alder inhibited spruce growth directly at the drier site, while at the wetter site it inhibited growth indirectly through effects mediated by competition with other vegetation and herbivory (Chapin et al. 2016).
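The mediation logic described above can be illustrated with a minimal sketch. It assumes a single predictor `x`, a single mediator `m`, and an outcome `y`, all simulated here with hypothetical coefficients, and it uses ordinary least squares in place of a full SEM estimator. The indirect effect is the product of the two path coefficients along the mediated route, and in this linear setting the total effect equals the direct effect plus the indirect effect.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)

def slope(x, y):
    """OLS slope of y regressed on x alone."""
    return cov(x, y) / cov(x, x)

def two_predictor_slopes(x, m, y):
    """OLS slopes of y regressed on x and m together (normal equations)."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    det = sxx * smm - sxm ** 2
    b_x = (smm * sxy - sxm * smy) / det
    b_m = (sxx * smy - sxm * sxy) / det
    return b_x, b_m

# Synthetic data (hypothetical values): x -> m -> y plus a direct x -> y path
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(500)]
m = [0.6 * xi + rng.gauss(0, 1) for xi in x]
y = [0.3 * xi + 0.5 * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]

a = slope(x, m)                            # path x -> m
direct, b = two_predictor_slopes(x, m, y)  # paths x -> y and m -> y
indirect = a * b                           # effect of x on y through m
total = slope(x, y)                        # equals direct + indirect for OLS
```

In a study such as the NEE example, `x`, `m`, and `y` would correspond to a driver, respiration, and NEE, respectively; a dedicated SEM package would additionally provide standard errors and fit statistics.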
Latent and observable variables
Measuring an abstract concept, such as “climate change,” “ecosystem structure and/or composition,” “resistance and resilience,” and “ecosystem service,” can pose a problem for ecological research. While direct measurements or units for these abstract concepts may not exist, statistical methods can derive these values from other related variables. SEM applies a confirmatory factor analysis to estimate latent constructs. The latent variable or construct is not in the dataset, as it is a derived common factor of other variables and could indicate a model’s cause or effect (Hoyle 1995, 2011; Grace 2006; Kline 2010; Byrne 2013). For example, latent variables were applied to evaluate the natural and social effects on grassland productivity in Mongolia and Inner Mongolia, China (Chen et al. 2015). When examining the potential contributions of land use, demographic and economic changes on urban expansion (i.e., green spaces) in the city of Shenzhen, China, Tian et al. (2013) treated land cover change (LCC), population, and economy as three latent variables, each characterized with two observable variables. Economy was found to play a more important role than population in driving LCC. Liu et al. (2016) measured the functional traits of trees as a latent variable based on tree height, crown diameter, wood diameter, and hydraulic conductivity. In addition to latent and observable variables, Grace and Bollen (2008) introduced composite variables for ecological applications of SEM. Composite variables are also unobservable variables, but they assume no error variance among the indicators and are not estimated by factor analysis. Instead of extracting the factors from a set of indicators, a composite variable is an exact linear combination of the indicator variables based on given weights. For example, Chaudhary et al. (2009) conducted a study on the ecological relationship in semiarid scrublands and measured fungal abundance, which is composed of hyphal density and the concentration of Bradford-reactive soil proteins, as a composite variable. Jones et al. (2014) applied soil minerals as a composite variable to represent the concentrations of zinc, iron, and phosphorus in soil.
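As a small illustration of the difference between a composite and a latent variable, the sketch below builds a composite as an exact weighted sum of its indicators, with no error term. The function name, the indicator values, and the weights are hypothetical and are not taken from any of the cited studies.

```python
def composite(indicators, weights):
    """Return the weighted sum of indicator values for each sampling unit."""
    if len(indicators) != len(weights):
        raise ValueError("one weight per indicator is required")
    n_units = len(indicators[0])
    return [
        sum(w * column[i] for w, column in zip(weights, indicators))
        for i in range(n_units)
    ]

# Hypothetical soil measurements for three sampling units
zinc = [1.0, 2.0, 3.0]
iron = [10.0, 20.0, 30.0]
phosphorus = [0.5, 0.5, 1.0]

# Hypothetical weights; in practice they are fixed by theory or estimated
soil_minerals = composite([zinc, iron, phosphorus], [0.5, 0.25, 2.0])
```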
Confirmatory factor analysis
Confirmatory factor analysis (CFA) is the method for measuring latent variables (Hoyle 1995, 2011; Kline 2010; Byrne 2013). It extracts the latent construct from other variables and shares the most variance with related variables. For example, abiotic stress as a latent variable is measured by the observation of soil changes (i.e., soil salinity, organic matter, flooding height; Fig. 2, Grace et al. 2010). Confirmatory factor analysis estimates latent variables based on the correlated variations of the dataset (e.g., association, causal relationship) and can reduce the data dimensions, standardize the scale of multiple indicators, and account for the correlations inherent in the dataset (Byrne 2013). Therefore, when postulating a latent variable, one should state the reason for using it. In the abiotic stress example given above, community stress and disturbance are latent variables that account for the correlation in the dataset. Shao et al. (2015) applied CFA to condense the soil-nutrition features into one variable that accounted for soil organic carbon, litter total nitrogen, and carbon-to-nitrogen ratio. Also, Capmouteres and Anand (2016) defined the habitat function as an environmental indicator that explained both plant cover and native bird abundance for the forest ecosystems by using CFA.
In addition to CFA, there is another type of factor analysis: exploratory factor analysis (EFA). The statistical estimation technique is the same for both. CFA is applied when the indicators for each latent variable are specified according to the related theories or prior knowledge (Joreskog 1969; Brown 2006; Harrington 2009), whereas EFA is applied to find the underlying latent variables. In practice, EFA is often performed to select the useful underlying latent constructs for CFA when there is little prior knowledge about the latent construct (Browne and Cudeck 1993; Cudeck and Odell 1994; Tucker and MacCallum 1997).
SEM is composed of the measurement model and the structural model. A measurement model measures the latent variables or composite variables (Hoyle 1995, 2011; Kline 2010), while the structural model tests all the hypothetical dependencies based on path analysis (Hoyle 1995, 2011; Kline 2010).
There are five logical steps in SEM: model specification, identification, parameter estimation, model evaluation, and model modification (Kline 2010; Hoyle 2011; Byrne 2013). Model specification defines the hypothesized relationships among the variables in an SEM based on one’s knowledge. Model identification checks whether the model is over-identified, just-identified, or under-identified. Model coefficients can only be estimated in a just-identified or over-identified model. Model evaluation assesses model performance or fit, with quantitative indices calculated for the overall goodness of fit. Modification adjusts the model to improve model fit, i.e., the post hoc model modification. Validation is the process to improve the reliability and stability of the model. Popular programs for SEM applications, such as AMOS, Mplus, LISREL, lavaan (R package), piecewiseSEM (R package), and Matlab, are often equipped with intuitive manuals (Rosseel 2012; Byrne 2013; Lefcheck 2015). The specific details for SEM applications are complicated, but users can seek help from tutorials provided by Grace (2006) and Byrne (2013).
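The identification step can be sketched with the usual counting rule, under the assumption that the model is fitted to a covariance matrix only (no mean structure). With p observed variables there are p(p + 1)/2 unique variances and covariances, and the degrees of freedom are that number minus the number of freely estimated parameters. The function names below are hypothetical, and the rule is a necessary rather than a sufficient condition for identification.

```python
def model_df(n_observed, n_free_parameters):
    """Degrees of freedom: unique variances/covariances minus free parameters."""
    n_moments = n_observed * (n_observed + 1) // 2
    return n_moments - n_free_parameters

def identification_status(n_observed, n_free_parameters):
    """Classify a model by the counting rule only."""
    df = model_df(n_observed, n_free_parameters)
    if df < 0:
        return "under-identified"
    if df == 0:
        return "just-identified"
    return "over-identified"

# Example: 4 observed variables give 10 unique moments
status = identification_status(n_observed=4, n_free_parameters=8)
```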
Model evaluation indices
SEM evaluation is based on the fit indices for the test of a single path coefficient (i.e., p value and standard error) and the overall model fit (i.e., χ2, RMSEA). From the literature, the usability of model fit indices appears flexible. Generally, the more fit indices applied to an SEM, the more likely that a misspecified model will be rejected, although this also increases the probability of good models being rejected. This also suggests that one should use a combination of at least two fit indices (Hu and Bentler 1999). There are recommended cutoff values for some indices, though none serve as the golden rule for all applications (Fan et al. 1999; Chen et al. 2008; Kline 2010; Hoyle 2011).
Chi-square test (χ2): χ2 tests the hypothesis that there is a discrepancy between the model-implied covariance matrix and the original covariance matrix. Therefore, a non-significant discrepancy is preferred. For optimal fitting of the chosen SEM, the χ2 test would be ideal with p > 0.05 (Bentler and Bonett 1980; Mulaik et al. 1989; Hu and Bentler 1999). One should not be overly concerned regarding the χ2 test because it is very sensitive to the sample size and not comparable among different SEMs (Bentler and Bonett 1980; Joreskog and Sorbom 1993; Hu and Bentler 1999; Curran et al. 2002).
Root mean square error of approximation (RMSEA) and standardized root mean square residual (SRMR): RMSEA is a “badness of fit” index where 0 indicates the perfect fit and higher values indicate the lack of fit (Browne and Cudeck 1993; Hu and Bentler 1999; Chen et al. 2008). It is useful for detecting model misspecification and less sensitive to sample size than the χ2 test. The acceptable RMSEA should be less than 0.06 (Browne and Cudeck 1993; Hu and Bentler 1999; Fan et al. 1999). SRMR is similar to RMSEA and should be less than 0.09 for a good model fit (Hu and Bentler 1999).
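For readers who want to see how RMSEA relates to the χ2 statistic, the sketch below uses a commonly cited point-estimate formula, RMSEA = sqrt(max(χ2 − df, 0) / (df (N − 1))). This is an assumption about the convention: some software divides by N instead of N − 1, so values may differ slightly from a given program's output. The function name and the example numbers are hypothetical.

```python
import math

def rmsea(chi2, df, n):
    """Point estimate of RMSEA from the model chi-square, df, and sample size."""
    if df <= 0:
        raise ValueError("RMSEA is undefined when df is not positive")
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

# Hypothetical model: chi-square of 30 on 20 df with 201 observations
example_rmsea = rmsea(chi2=30.0, df=20, n=201)
acceptable = example_rmsea < 0.06  # cutoff quoted in the text
```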
Comparative fit index (CFI): CFI represents the amount of variance that has been accounted for in a covariance matrix. It ranges from 0.0 to 1.0. A higher CFI value indicates a better model fit. In practice, the CFI should be close to 0.95 or higher (Hu and Bentler 1999). CFI is less affected by sample size than the χ2 test (Fan et al. 1999; Tabachnick and Fidell 2001).
Goodness-of-fit index (GFI): The range of GFI is 0–1.0, with the best fit at 1.0. Because GFI is affected by sample size, it is no longer recommended (MacCallum and Hong 1997; Sharma et al. 2005).
Normed fit index (NFI): NFI is highly sensitive to the sample size (Bentler 1990). For this reason, NFI is no longer used to assess model fit (Bentler 1990; Hoyle 2011).
Tucker-Lewis index (TLI): TLI is a non-normed fit index (NNFI) that partly overcomes the disadvantages of NFI and also proposes a fit index independent of sample size (Bentler and Bonett 1980; Bentler 1990). A TLI of >0.90 is considered acceptable (Hu and Bentler 1999).
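CFI and TLI both compare the fitted model with a baseline (independence) model. The sketch below uses the standard textbook formulas, which are stated here as an assumption because the source does not give them: CFI = 1 − max(χ2_M − df_M, 0) / max(χ2_M − df_M, χ2_B − df_B, 0) and TLI = (χ2_B/df_B − χ2_M/df_M) / (χ2_B/df_B − 1). The function names and input values are hypothetical.

```python
def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative fit index from model and baseline chi-square values."""
    model_excess = max(chi2_model - df_model, 0.0)
    denominator = max(chi2_model - df_model, chi2_base - df_base, 0.0)
    if denominator == 0.0:
        return 1.0  # neither model shows misfit beyond its df
    return 1.0 - model_excess / denominator

def tli(chi2_model, df_model, chi2_base, df_base):
    """Tucker-Lewis (non-normed) index; it can fall outside the 0-1 range."""
    base_ratio = chi2_base / df_base
    model_ratio = chi2_model / df_model
    return (base_ratio - model_ratio) / (base_ratio - 1.0)

# Hypothetical values: model chi-square 30 (20 df), baseline 520 (28 df)
example_cfi = cfi(30.0, 20, 520.0, 28)
example_tli = tli(30.0, 20, 520.0, 28)
```

Because TLI is non-normed, values slightly above 1 or below 0 are possible and are not necessarily a computational error.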
Akaike information criterion (AIC) and Bayesian information criterion (BIC): AIC and BIC are two relative measures from the perspectives of model selection rather than the null hypothesis test. AIC offers a relative estimation of the information lost when the given model is used to generate data (Akaike 1974; Kline 2010; Hoyle 2011). BIC is an estimation of how parsimonious a model is among several candidate models (Schwarz 1978; Kline 2010; Hoyle 2011). AIC and BIC are not useful in testing the null hypothesis but are useful for selecting the model with the least overfitting (Burnham and Anderson 2004; Johnson and Omland 2004).
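A minimal sketch of AIC- and BIC-based comparison follows, assuming the log-likelihood form AIC = −2 log L + 2k and BIC = −2 log L + k ln(N), where k is the number of free parameters and N the sample size. SEM programs may report χ2-based variants that differ by a constant, so only differences among candidate models fitted to the same data are meaningful. The candidate models and their values are hypothetical.

```python
import math

def aic(log_likelihood, n_parameters):
    return -2.0 * log_likelihood + 2.0 * n_parameters

def bic(log_likelihood, n_parameters, n_samples):
    return -2.0 * log_likelihood + n_parameters * math.log(n_samples)

# Hypothetical candidates: name -> (log-likelihood, number of free parameters)
candidates = {"simple": (-100.0, 5), "complex": (-98.0, 9)}
n_samples = 100

aic_scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
bic_scores = {name: bic(ll, k, n_samples) for name, (ll, k) in candidates.items()}

# The smaller value is preferred for both criteria
best_by_aic = min(aic_scores, key=aic_scores.get)
best_by_bic = min(bic_scores, key=bic_scores.get)
```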
Powerful yet unexplored SEMs
Experimental and observational databases in ecological studies are often complex, non-randomly distributed, and hierarchically organized, and they have spatial and temporal constraints (i.e., potential autocorrelations). While corresponding SEMs exist for each type of unique data, these powerful and flexible SEMs have not yet been widely explored in ecological research. Here we introduce some unexplored SEM uses for future endeavors.
Latent growth curve (LGC) model
LGC models can be used to interpret data with serial changes over time. The LGC model is built on the assumption that there is a structure growing along with the data series. The slope of growth is a latent variable, which represents the change in growth within a specified interval, and the loading factors are a series of growing subjects specified by the user (Kline 2010; Hoyle 2011; Duncan et al. 2013).
There are few ecological publications using LGC models. However, we found a civil engineering study on water quality applying the LGC model to examine the acidic deposition from acid rain in 21 stream sites across the Appalachian Mountain Region from 1980 to 2006 (Chen and Lin 2010). This study estimated the time-varying latent variable for each stream as the change of water properties over time by using the LGC model. Because longitudinal data (e.g., time series) is common in ecological research, LGC is especially effective in testing time-varying effects (Duncan et al. 2013; Kline 2010; Hoyle 2011).
In addition to LGC, SEM can be incorporated into a time series analysis (e.g., autoregressive integrated moving average model). For example, Almaraz (2005) applied a time series SEM to predict the population growth of the purple heron (Ardea purpurea). The moving average process was used as a matrix of time-based weights for analyzing the seasonal changes and autocorrelations.
From an ecological perspective, LGC is more plausible than conventional time series analysis because an LGC needs only longitudinal data with more than three periods, whereas a time series analysis requires a longer series with more observations (e.g., time series of economic or climatic changes). The LGC assumes a stable growth curve of the observation. Therefore, users can weigh the curve based on the time span, whereas a time series requires steady intervals in the series. For further guidance, refer to the book written by Bollen and Curran (2006).
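The role of user-specified loadings in a linear LGC model can be sketched as follows, under the common parameterization in which every occasion loads on the latent intercept with a fixed value of 1 and on the latent slope with its time score. The time scores need not be equally spaced, which reflects the point about weighing the curve by time span. Function names and numbers are hypothetical, and only the model-implied means are shown, not the estimation.

```python
def lgc_loadings(time_scores):
    """One row per occasion: [intercept loading, slope loading]."""
    return [[1.0, float(t)] for t in time_scores]

def implied_means(time_scores, intercept_mean, slope_mean):
    """Model-implied mean of the observed variable at each occasion."""
    return [
        row[0] * intercept_mean + row[1] * slope_mean
        for row in lgc_loadings(time_scores)
    ]

# Four surveys at years 0, 1, 2, and 5 (unequal intervals, hypothetical)
means = implied_means([0, 1, 2, 5], intercept_mean=2.0, slope_mean=0.5)
```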
Bayesian SEM (BSEM)
BSEM assumes theoretical support and that the prior beliefs are strong. One can use new data to update a prior model so that posterior parameters can be estimated (Raftery 1993; Lee 2007; Kaplan and Depaoli 2012). The advantage of BSEM is that it has no requirements on sample size. However, it needs prior knowledge on data distribution and parameters. Arhonditsis et al. (2006) applied BSEM to explore spatiotemporal phytoplankton dynamics, with a sample size of <60. The estimation of the model parameters’ posterior distribution is based on various Monte Carlo simulations to compute the overall mean and a 95% confidence interval. Due to the Bayesian framework, the model assessment of BSEM is more like a model comparison that is not based on χ2, RMSEA, CFI, etc. There are many comparison methods for the Bayesian approach. BIC is widely used, and many statisticians suggest posterior predictive checking to estimate the predictive ability of the model (Raftery 1993; Lee 2007; Kaplan and Depaoli 2012). The SEM analysis, which uses maximum likelihood (ML) and the likelihood ratio χ2 test, often strictly rejects the substantive theory and unnecessarily utilizes model modification to improve the model fit by chance. Therefore, the Bayesian approach has received escalating attention in SEM applications due to its flexibility and better representation of the theory.
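The idea of updating a prior with new data can be shown with a deliberately simplified sketch. It is not a BSEM: it assumes a single parameter with a normal prior, normally distributed observations with a known noise standard deviation, and the resulting closed-form (conjugate) posterior, whereas BSEM software typically relies on Monte Carlo sampling. All names and numbers are hypothetical.

```python
import math

def update_normal(prior_mean, prior_sd, data, noise_sd):
    """Posterior mean and sd for a normal mean with known noise sd."""
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = len(data) / noise_sd ** 2
    posterior_precision = prior_precision + data_precision
    data_mean = sum(data) / len(data)
    posterior_mean = (
        prior_precision * prior_mean + data_precision * data_mean
    ) / posterior_precision
    return posterior_mean, math.sqrt(1.0 / posterior_precision)

# Prior belief centred on 0; four new observations centred on 1
post_mean, post_sd = update_normal(0.0, 1.0, [1.0, 1.0, 1.0, 1.0], noise_sd=2.0)

# Approximate 95% interval for the posterior (normal quantile 1.96)
interval = (post_mean - 1.96 * post_sd, post_mean + 1.96 * post_sd)
```

With equal prior and data precision, as here, the posterior mean falls halfway between the prior mean and the sample mean, which shows how strongly an informative prior can shape results when samples are small.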
Partial least square SEM (PLS-SEM)
PLS-SEM is the preferred method when the study object does not have a well-developed theoretical base, particularly when there is little prior knowledge on causal relationship. The emphasis here is on exploration rather than confirmation. PLS-SEM requires neither a large sample size nor specific assumptions about the distribution of the data or about missing data. Users with small sample sizes and less theoretical support for their research can apply PLS-SEM to test the causal relationship (Hair et al. 2013). The algorithm of PLS-SEM is different from the common SEM, which is based on maximum likelihood. When the sample size or data distribution of a study makes a common SEM hard to apply, PLS-SEM has a functional advantage.
By 2016, no publications on the application of PLS-SEM in ecological studies were found, according to our literature search. We recommend that users at the beginning stage or those who have fewer data apply PLS-SEM to generate the necessary evidence for causal relationship and variable selections. This will allow users to continue collecting long-term data while updating their hypotheses (Monecke and Leisch 2012).
Hierarchical SEM
The hierarchical model, also known as multilevel SEM, analyzes hierarchically clustered data. Hierarchical SEM can specify the direct and indirect causal effect between clusters (Curran 2003; Mehta and Neale 2005; Kline 2010). It is common for an experiment to hold some variables constant, resulting in multiple groups or a nested dataset. The conventional SEM omits the fact that path coefficients and intercepts will potentially vary between hierarchical levels (Curran 2003; Mehta and Neale 2005; Shipley 2009; Kline 2010). This method focuses on data generated with a hierarchical structure. Therefore, the sample size needs to be large.
The application of hierarchical SEM is flexible. Take the work by Shipley (2009), for example, who analyzed the nested effects on plant growth between hierarchies, which included three clusters: site, year, and age. The causal relationship between the levels could be developed by Shipley’s d-sep test. With knowledge of a causal nested system, one can first specify the hierarchies before developing the SEM analysis within each nested structure (Curran 2003; Mehta and Neale 2005; Kline 2010, Fig. 3). The model in Fig. 3 is a confirmatory factor analysis, with model parameters varying in each hierarchy.
SEM models and variable selection
Selecting the appropriate variables and models is the initial step in an SEM application. The selection algorithm can be based on preferable variables and models according to certain statistical criteria (Burnham and Anderson 2002; Burnham et al. 2011). For example, the selection criterion could be based on fit indices (e.g., AIC and BIC). Variable selection, also called feature selection, is a process of selecting the most relevant variables to prevent overfitting (Sauerbrei et al. 2007; Murtaugh 2009; Burnham and Anderson 2002) and is also a required procedure for both PLS-SEM and exploratory factor analysis. For example, multiple variables (e.g., water depth, elevation, and zooplankton) were selected to predict the richness of native fish (Murtaugh 2009). For other statistical analyses, AIC- or BIC-based models are widely recommended in ecology (Johnson and Omland 2004; Sauerbrei et al. 2007; Burnham et al. 2011; Siciliano et al. 2014). For both indices, a smaller fit value is sought. Other fit indices can also be used as selection criteria. In a spatially explicit SEM exercise, Lamb (2014) suggested a preferable model from candidate models of different bin sizes based on χ2.
The remaining challenges
SEM applications from 1999 through 2016
During our literature review, our keyword search included “structural equation modeling” and “ecology” through the Web of Science and Google Scholar. We found and reviewed 146 ecological publications that applied SEM from 1999 through 2016 (Additional file 1). The use of SEM in ecological research has rapidly increased in recent years (Eisenhauer et al. 2015). It is clear that a major advantage of SEM is that it can visualize data and hypotheses in a graphic model. Almost all of these studies took advantage of this. However, some SEM applications needed to be improved. Some studies did not report the necessary information such as the R2 or p values of path coefficients (i.e., 22.6% reported R2, 65.8% reported p value), model modification/validations, nor an explanation of latent variables in SEMs (i.e., none explained the latent variable estimation, 28.1% did not have an estimation method). More so, 93.2% of the publications did not justify their model selection (Table 1).
Issues in SEM applications
Our review of the 146 publications revealed that many SEM applications needed to be improved. We summarized and separated these issues into ten categories (Tables 1 and 2).
Evidence of causal relationships
The test of causal relationships is central to SEM. The first step of SEM is to specify the causal relationships and correlations among the variables. Causal relationship and correlations without proper justification or theoretical foundations undermine the causal relationship in the hypotheses (Shipley 2002). The majority of the papers (94.2%) provided theoretical bases for their causal and correlation assumptions, while the remaining did not (Table 1).
Bollen and Pearl (2013) stated that strong causal relationships are made by (1) “imposing zero coefficients” and (2) “imposing zero covariance” to the model. They stated that
Strong causal assumptions assume that parameters take specific values. For instance, a claim that one variable has no causal effect on another variable is a strong assumption encoded by setting the coefficient to zero. Or, if one assumes that two disturbances are uncorrelated, then we have another strong assumption that the covariance equals zero.
A hypothesized model is composed of causal relationship and correlation assumptions, both of which should be stated clearly in any research based on design, prior experiences, scientific knowledge, logical arguments, temporal priorities, or other empirical evidence. It is notable that adding a non-zero covariance can improve some of the model fit indices. However, some studies took advantage of this by adding non-zero covariance without theoretical support, making the non-zero covariance less meaningful, or even harmful, for hypothesis testing.
Feedback loops
Feedback is a basic ecosystem dynamic, which implies a cyclic phenomenon. The feedback loop is a useful function provided by SEM that could be either direct (i.e., V1 ⇄ V2) or indirect (i.e., V1 → V2 → V3 → V1, Fig. 4). As useful as this approach may be, there were only a couple of studies that applied feedback loops. This is likely because the definition of a feedback loop can easily confuse a new user. Kline (2006) listed two assumptions for feedback loops:
One is that of equilibrium, which means that any changes in the system underlying a feedback relation have already manifested their effects and that the system is in a steady state (Heise 1975). The other assumption is that the underlying causal structure does not change over time.
Some data are generated naturally from the ecosystem without artificial manipulation. The specification of the cause and outcome of ecological dynamics is confusing because the underlying mechanisms of data generation are complex and simultaneous. The applications of feedback loops, which specify the causal relationship in a loop, can explain the ecological dynamics in a cyclical perspective. When a research design is based on a loop perspective, the SEM analysis can evaluate if the cycle is virtuous, vicious, or neutral.
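The equilibrium assumption for a direct loop (V1 ⇄ V2) can be made concrete with a small sketch. It assumes a linear loop with a coefficient `b21` for V1 → V2 and `b12` for V2 → V1 (hypothetical names and values). A unit change in V1 reaches V2 directly, then again after each trip around the loop, so the settled total effect is the geometric series b21 (1 + g + g² + …) = b21 / (1 − g), where g = b12 · b21. The series converges only when |g| < 1, which is one way to read the requirement that the system has reached a steady state.

```python
def equilibrium_total_effect(b21, b12):
    """Total effect of V1 on V2 once the loop has settled."""
    gain = b12 * b21
    if abs(gain) >= 1.0:
        raise ValueError("loop does not settle: |b12 * b21| must be below 1")
    return b21 / (1.0 - gain)

def truncated_total_effect(b21, b12, n_cycles):
    """Same effect accumulated over a finite number of trips around the loop."""
    gain = b12 * b21
    return sum(b21 * gain ** k for k in range(n_cycles))

settled = equilibrium_total_effect(b21=0.5, b12=0.4)
after_50_cycles = truncated_total_effect(b21=0.5, b12=0.4, n_cycles=50)
```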
Model and variable selection
As argued by Box (1976), it is difficult to find a completely correct model, but a simple model could represent a complicated phenomenon. Therefore, one needs to cautiously select the model and variables based on the research goal, the statistical foundation, and the theoretical support. In our review, only a few papers applied a model (6.8%) or a variable (8.9%) selection (Table 2). The model and variable selection is key to multivariable analysis. One should demonstrate the principle of model postulation in addition to research design. Indeed, there were very few papers discussing the technique and principle of their models. A well-applied principle of parsimony for model users emphasizes the simplicity of a model. According to this principle, users should justify whether a model can represent a phenomenon with a few variables. Cover and Thomas (2012) have proposed other modeling principles.
Model identification
Model identification was often overlooked, with only 67.8% of the publications reporting it. The oversight happened when latent variables were estimated. Kline (2010) proposed three essential requirements when identifying the appropriate SEM: (1) “the model degrees of freedom must be at least zero to assure the degrees of freedom (df) is greater than zero”; (2) “every latent variable (including the residual terms) must be assigned a scale, meaning that either the residual terms’ (disturbance) path coefficient and one of the latent variable’s factor loading should be fixed to 1 or that the variance of a latent variable should be fixed to 1”; and (3) “every latent variable should have at least two indicators.”
Most publications provided the df values in their SEMs and we estimated the df of those that did not report. All publications had a df greater than zero. All the models with CFA met the requirement that each latent variable should have at least two indicators. However, many studies skipped over scaling the latent variables before estimation, resulting in non-robust results. The unscaled latent variable can hardly provide useful information to the causal test. Otherwise, it is likely that the user had just fit the model by chance.
Estimation methods
Many estimation methods in SEM exist, such as maximum likelihood (ML), generalized least squares, weighted least squares, and partial least squares. Maximum likelihood estimation is the default estimation method in many SEM software packages (Kline 2010; Hoyle 2011). All of the publications that stated their estimation methods used ML, which assumes that (1) no skewness or kurtosis in the joint distribution of the variables exists (e.g., multivariate normality); (2) the variables are continuous; and (3) there are very few missing data (i.e., <5%, Kline 2010; Hoyle 2011). However, very few publications provided this key information about their data. Instead, they simply ignored the data quality or chose not to discuss the raw data. Some papers briefly discussed the multivariate normality of their data, but none discussed the data screening and transformation (i.e., skewness or kurtosis, continuous or discrete, and missing data). We assume that most of their ecological data were continuous, yet one needs to ensure the continuity of the data to support the choice of estimation methods. The partial least square method requires neither continuous data nor multivariate normality.
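The kind of data screening referred to above can be sketched for a single variable as follows. The sketch assumes missing observations are coded as `None`, uses moment-based sample skewness and excess kurtosis, and flags a variable when the missing fraction reaches the 5% level quoted in the text. It checks univariate properties only; multivariate normality would need a dedicated test. The function name and example values are hypothetical.

```python
def screen_variable(values, max_missing=0.05):
    """Summarize one variable; missing observations are given as None."""
    observed = [v for v in values if v is not None]
    n = len(observed)
    missing_fraction = 1.0 - n / len(values)
    mean = sum(observed) / n
    # Central moments of the observed values
    m2 = sum((v - mean) ** 2 for v in observed) / n
    m3 = sum((v - mean) ** 3 for v in observed) / n
    m4 = sum((v - mean) ** 4 for v in observed) / n
    return {
        "missing_fraction": missing_fraction,
        "too_many_missing": missing_fraction >= max_missing,
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,
    }

# Hypothetical variable with one missing value out of six
report = screen_variable([1.0, 2.0, 3.0, 4.0, 5.0, None])
```

Skewness and excess kurtosis near zero are consistent with normality for that variable; large absolute values suggest a transformation or a different estimator should be considered.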
Explanations of the measured latent variables
We did not find a publication with sufficient explanation for its CFA in regard to the prior knowledge or preferred function (i.e., unmeasured directly, quantifiable, and necessary to the model) for measuring the latent variable. Factor analysis is a useful tool for dimension reduction. The factor analysis applied in SEM measurement models (CFA or EFA) is used to measure the latent variable, which requires a theoretical basis. The prior knowledge of a measurement model includes two parts: (1) the prior knowledge of indicators for a latent variable and (2) the prior knowledge of the relationships between the latent variable and its indicators (Bentler and Chou 1987). For example, the soil fertility of a forest as a latent variable was estimated based on two types of prior knowledge, including (1) the observation of tree density, water resources, and presence of microorganisms and (2) the positive correlations among the three observed variables.
If the estimation of a latent variable is performed without prior knowledge, CFA will become a method only for data dimension reduction. In addition, we did not find any CFAs in the ecological publications explaining the magnitude of the latent variable. Therefore, these latent variables lack a meaningful explanation in regard to the hypothesis of an SEM (Bollen 2002; Duncan et al. 2013). Another issue concerning latent variables in ecological research is that some “observable variables” (e.g., salinity, pH, temperature, and density) are measured as latent variables. The reasons for measuring a latent variable are very flexible, but they require the user to explain the application of CFA carefully.
SEM requires measurement models to be based on prior knowledge so that latent variables can be interpreted correctly (Bentler and Chou 1987). SEM is not merely a method for reducing data dimensions. Instead, one should explain the magnitude and importance of indicators and latent variables. Therefore, users should base their explanations on theory when discussing the associated changes between latent variables and indicators. The explanation should include an analysis of the magnitude of the latent variable, indicators, and factor loadings.
Report of model fit indices
Reporting fit indices is strongly recommended for any SEM. Approximately 93.8% of the publications provided model fit indices. However, none justified their choice of fit indices. Those that did not report model fit indices also gave no reason for the omission. In these publications, χ2, CFI, RMSEA, TLI, GFI, NFI, SRMR, AIC, and BIC were frequently used. The χ2 was included in almost every paper because it is a robust measure of model fit. Some publications without significant χ2 tests reported their SEM results regardless. In addition, GFI and NFI were also used even though they are not recommended as measures of model fit.
Fit indices are important indicators of model performance. Due to their different properties, they are sensitive to many factors, such as data distribution, missing data, model size, and sample size (Hu and Bentler 1999; Fan and Sivo 2005; Barrett 2007). Most fit indices (i.e., χ2, CFI, RMSEA, TLI, GFI, NFI, SRMR) are greatly influenced by departures from multivariate normality (i.e., an assumption of the ML method applied in SEM). Meanwhile, CFI, RMSEA, and SRMR are useful in detecting model misspecification, and relative fit indices (e.g., AIC and BIC) are mainly used for model selection (Curran et al. 1996; Fan and Sivo 2005; Ryu 2011). The selection of model fit indices in an SEM exercise is key to explaining the model (e.g., type, structure, and hypothesis). Users should at least discuss their choice of fit indices to ensure that it is consistent with their study objectives.
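To make the dependence of these indices on χ2, degrees of freedom, and sample size concrete, the sketch below computes RMSEA, CFI, and TLI from their commonly used textbook definitions. The function names and the example numbers are hypothetical, and individual SEM programs may differ in detail (for example, some use N rather than N − 1 in the RMSEA denominator), so the output of a given package should take precedence.

```python
import math


def rmsea(chi2, df, n):
    """Root mean square error of approximation for a model chi-square."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative fit index relative to the baseline (independence) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, 0.0)
    denom = max(d_model, d_base)
    # If neither model shows misfit beyond its degrees of freedom, fit is perfect.
    return 1.0 if denom == 0 else 1.0 - d_model / denom


def tli(chi2_model, df_model, chi2_base, df_base):
    """Tucker-Lewis index; unlike CFI it is not bounded to the [0, 1] range."""
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    return (ratio_base - ratio_model) / (ratio_base - 1.0)


# Hypothetical example: model chi2 = 30 on 20 df, baseline chi2 = 530 on 30 df, N = 201.
example = {
    "RMSEA": rmsea(30, 20, 201),
    "CFI": cfi(30, 20, 530, 30),
    "TLI": tli(30, 20, 530, 30),
}
```

Because all three indices are functions of the same χ2 statistics, anything that distorts χ2 (non-normal data, small samples) carries over to them, which is one reason to justify the chosen indices rather than simply list them.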
Report of the results
An SEM report should document the full estimation and modeling process. However, most publications did not include a full description of the results of their hypothesis tests. Some publications stated that their SEMs were based on a covariance matrix (Table 1), while even fewer studies reported the exact input covariance or correlation matrix. No study reported the multivariate normality, missing data, or outliers of their data. The majority of the papers (82.2%) reported the path coefficients, but very few reported both unstandardized and standardized path coefficients. A small percentage (8.9%) of the publications reported the standard errors of the path coefficients. The basic statistics (i.e., p value, R2, standard errors) are as important as the overall fit indices because they describe the validity and reliability of each path and provide evidence about the model when the overall fit is poor (Kline 2010; Hoyle 2011).
Hoyle and Isherwood (2013) suggested that a publication with an SEM analysis should follow the Journal Article Reporting Standards of the American Psychological Association. The reporting guidelines comprise five components (McDonald and Ho 2002; Jackson et al. 2009; Kline 2010; Hoyle and Isherwood 2013):
Model specification: The model specification process should be reported, including prior knowledge of the theoretically plausible models, prior knowledge of the positive or negative direct effects among variables, data sampling method, sample size, and model type.
Data preparation: Data processing should be reported, including the assessment of multivariate normality, analysis of missing data, method to address missing data, and data transformations.
Estimation of SEM: The estimation procedure should be reported, including the input matrix, estimation method, software brand and version, and method for fixing the scale of latent variables.
Model evaluation and modification: The model evaluation should be reported, including fit indices with cutoff values and model modification.
Reports of findings: All of the findings from an SEM analysis should be reported, including latent variables, factor loadings, standard errors, p values, R2, standardized and unstandardized structure coefficients, and graphic representations of the model.
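The per-path statistics listed above are closely related to one another, which a small sketch can make explicit. It assumes a single path from x to y with an unstandardized coefficient b and its standard error; the function name and the example values are hypothetical, and the p value uses a large-sample normal (Wald) approximation rather than the output of any particular SEM program.

```python
from statistics import NormalDist


def path_report(b, se, sd_x, sd_y):
    """Collect the statistics commonly reported for one path x -> y.

    b: unstandardized path coefficient; se: its standard error;
    sd_x, sd_y: standard deviations of the predictor and the response.
    """
    z = b / se
    return {
        "unstandardized": b,
        # Standardized coefficient: change in y (in SD units) per SD change in x.
        "standardized": b * sd_x / sd_y,
        "standard_error": se,
        "z": z,
        # Two-sided p value from the standard normal distribution.
        "p_value": 2.0 * (1.0 - NormalDist().cdf(abs(z))),
    }


# Hypothetical path: b = 0.5, SE = 0.25, SD(x) = 2, SD(y) = 4.
report = path_report(0.5, 0.25, 2.0, 4.0)
```

Reporting both forms matters because the unstandardized coefficient keeps the original measurement units, whereas the standardized one allows paths to be compared within a model.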
The estimation of sample size is another issue in SEM applications. So far, the estimation of sample size is flexible, and users can refer to several authors’ recommendations (Fan et al. 1999; Muthen and Muthen 2002; Iacobucci 2010). While 61.0% of the studies reported the sample size clearly, none of them justified the sample size with sound theory (Table 2). Technically, the sample size required for an SEM depends on many factors, including fit index, model size, distribution of the variables, amount of missing data, reliability of the variables, and strength of path parameters (Fan et al. 1999; Muthen and Muthen 2002; Fritz and MacKinnon 2007; Iacobucci 2010). Some researchers recommend a minimum sample size of 100–200 or five cases per free parameter in the model (Tabachnick and Fidell 2001; Kline 2010). One should be cautious when applying these general rules, however. Model-based methods for estimating sample size are increasingly recommended, with sound methods based on fit indices or power analysis of the model. Muthen and Muthen (2002) developed a method based on Monte Carlo simulation that uses statistical power analysis of the SEM to calculate sample size (Cohen 2013). Kim (2005) developed equations to compute the sample size based on model fit indices for a given statistical power.
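The logic of a simulation-based sample size estimate can be sketched in a few lines. This is not the procedure of Muthen and Muthen (2002), which simulates and refits the full SEM. As a loudly simplified stand-in, it simulates a single path y = βx + e with standard normal x and e, estimates the path by ordinary least squares, and uses a normal approximation to the t test. The function names, the effect size of 0.3, and the candidate sample sizes are all assumptions made for illustration.

```python
import random
from statistics import NormalDist


def simulate_power(n, beta, n_sims=500, alpha=0.05, seed=1):
    """Share of simulated datasets of size n in which the path is significant."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # normal approximation to t
    rejections = 0
    for _ in range(n_sims):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [beta * xi + rng.gauss(0, 1) for xi in x]
        mx, my = sum(x) / n, sum(y) / n
        sxx = sum((xi - mx) ** 2 for xi in x)
        sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
        b = sxy / sxx                      # OLS slope (the path estimate)
        a = my - b * mx                    # OLS intercept
        rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
        se = (rss / (n - 2) / sxx) ** 0.5  # standard error of the slope
        if abs(b / se) > z_crit:
            rejections += 1
    return rejections / n_sims


def smallest_n(beta, candidates, target=0.8, **kwargs):
    """First candidate sample size whose simulated power reaches the target."""
    for n in candidates:
        if simulate_power(n, beta, **kwargs) >= target:
            return n
    return None


# Hypothetical question: how many cases are needed to detect a path of 0.3?
n_needed = smallest_n(0.3, candidates=[20, 50, 100, 150, 200])
```

For a real application, the data-generating step would be replaced by the hypothesized SEM with plausible parameter values, and the fitting step by the SEM software actually used, so that model size, missing data, and non-normality are reflected in the estimated power.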
We did not find any SEM that was validated in the reviewed ecological studies, even though validation is a necessary process in quantitative analysis. This is probably because most SEM software is developed without model validation features. The purpose of model validation is to provide more evidence for the hypothetical model. The basic method of model validation is to test a model with two or more random datasets drawn from the same sample. Therefore, validation requires a large sample size. The principle of model validation is to ensure that the parameters are similar when a model is fitted to different datasets from the same population. This technique is a required step in many learning models. However, it is still uncommon in SEM applications.
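A minimal sketch of this split-sample idea is given below, under the assumption that the model is reduced to one path estimated by ordinary least squares. The function names and the simulated data are hypothetical. In practice the same split would be applied to the full SEM, and all path coefficients and loadings would be compared between the subsets.

```python
import random


def ols_slope(pairs):
    """Ordinary least squares slope for a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    return sxy / sxx


def split_half_slopes(pairs, seed=0):
    """Randomly split the sample in two and estimate the path in each half."""
    rng = random.Random(seed)
    shuffled = pairs[:]          # copy so the caller's data are not reordered
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return ols_slope(shuffled[:half]), ols_slope(shuffled[half:])


# Hypothetical data: 400 cases generated with a true path of 0.5 plus noise.
rng = random.Random(42)
data = []
for _ in range(400):
    x = rng.gauss(0, 1)
    data.append((x, 0.5 * x + rng.gauss(0, 1)))

slope_a, slope_b = split_half_slopes(data)
difference = abs(slope_a - slope_b)  # small values indicate stable estimates
```

Each half contains only half of the cases, which is why this form of validation requires a large sample, as noted above.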
SEM is a powerful multivariate analysis tool that has great potential in ecological research as data accessibility continues to increase. However, applying it well remains a challenge even though it was introduced to the ecological community decades ago. Despite its rapidly increasing application in ecological research, well-established models remain rare. Well-established models can serve as prior models, an approach that has been used extensively in psychometrics, behavioral science, business, and marketing research. There is an overlooked yet valuable opportunity for ecologists to establish an SEM representing the complex network of any ecosystem.
The future of SEM in ecological studies
Many ecological studies are characterized by large amounts of public data, which call for multivariate data analysis. This gives SEM users the opportunity to look for suitable public data and uncover patterns in research. However, big data will also inevitably bring new issues, such as the uncertainty of data sources. Therefore, improved data preparation protocols for SEM research are urgently needed. Fortunately, the rapid growth in the use of data-driven models, such as machine learning, gives SEM users a promising opportunity to develop creative methods that combine hypothesis-based and data-driven models.
The growing availability of big data is transforming studies from hypothesis-driven and experiment-based research to more inductive, data-driven, and model-based research. Causal inference derived from data itself with learning algorithms and little prior knowledge has been widely accepted as accurate (Hinton et al. 2006; LeCun et al. 2015). The original causal foundation of SEM was based on a hypothesis test (Pearl 2003, 2009, 2012; Bareinboim and Pearl 2015). However, with the advancement of data mining tools, data-driven and hypothesis-driven models may be combined in the future. Here, we emphasize the importance of using hypothesis-based models that take a deductive scientific stance, with prior knowledge or related theory. Meanwhile, we also agree that new technologies such as machine learning under big data exploration will stimulate new perspectives on ecological systems. On the other hand, the increased data availability and new modeling approaches, as well as their possible marriage with SEM, may skew our attention towards phenomena that deliver easily accessible data while obscuring other important phenomena (Brommelstroet et al. 2014).
Akaike H (1974) A new look at the statistical model identification. IEEE Trans Automat Contr 19(6):716–723
Chen J, John R, Shao C, Fan Y, Zhang Y, Amarjargal A, Brown DG, Qi J, Han J, Lafortezza R, Dong G (2015) Policy shifts influence the functional changes of the CNH systems on the Mongolian plateau. Environ Res Lett 10(8):085003
Curran PJ, Bollen KA, Paxton P, Kirby J, Chen F (2002) The noncentral chi-square distribution in misspecified structural equation models: finite sample results from a Monte Carlo simulation. Multivar Behav Res 37(1):1–36
Santibáñez-Andrade G, Castillo-Argüero S, Vega-Peña E, Lindig-Cisneros R, Zavala-Hurtado J (2015) Structural equation modeling as a tool to develop conservation strategies using environmental indicators: the case of the forests of the Magdalena river basin in Mexico City. Ecol Indic 54:124–136
Shao Y, Bao W, Chen D, Eisenhauer N, Zhang W, Pang X, Xu G, Fu S (2015) Using structural equation modeling to test established theory and develop novel hypotheses for the structuring forces in soil food webs. Pedobiologia 58(4):137–145
Shao J, Zhou X, Luo Y, Li B, Aurela M, Billesbach D, Blanken PD, Bracho R, Chen J, Fischer M, Fu Y, Gu L, Han S, He Y, Kolb T, Li Y, Nagy Z, Niu S, Oechel W, Pinter K, Shi P, Suyker A, Torn M, Varlagin A, Wang H, Yan J, Yu G, Zhang J (2016) Direct and indirect effects of climatic variations on the interannual variability in net ecosystem exchange across terrestrial ecosystems. Tellus B, doi: http://dx.doi.org/10.3402/tellusb.v68.30575
Sharma S, Mukherjee S, Kumar A, Dillon WR (2005) A simulation study to investigate the use of cutoff values for assessing model fit in covariance structure models. J Bus Res 58(7):935–943
Siciliano S, Anne P, Tristrom W, Eric L, Andrew B, Mark B, van Josie D, Ji M, Belinda F, Paul G, Chu H, Ian S (2014) Soil fertility is associated with fungal and bacterial richness, whereas pH is associated with community composition in polar soil microbial communities. Soil Biol Biochem 78:10–20
This study is supported by the Sustainable Energy Pathways (CHE) Program (#1230246) and the Dynamics of Coupled Natural and Human Systems (CNH) Program (#1313761) of the US National Science Foundation (NSF). We thank Dr. Zutao Ouyang for the statistical help.
YF designed and carried out the conceptual review of SEM literature. JC constructed the overall structure of the manuscript and revised the content. GS guided and revised the scientific writings. RJ carried out the review of matrices in the relevant literature and revised the manuscript. SW wrote the introduction of PLS-SEM. HP carried out the review of model fit indices in the literature. CS carried out the review of model selection. All authors read and approved the final manuscript.
Yi Fan is a graduate student of geography with research interests in data mining.
Jiquan Chen is a professor of geography with research interests in ecosystem processes and their interactive feedbacks to biophysical and human changes.
Gabriela Shirkey is a laboratory technician with interests in conservation strategies and community engagement.
Ranjeet John is a research associate with interests in remote sensing and geospatial technology.
Susie R. Wu is a research associate in geography with research interests in sustainable product design.
Hogeun Park is a doctoral student in urban planning with research interests in developing nations and the urbanization process.
Changliang Shao is a research associate with interests in ecosystem carbon, water, and energy fluxes.
All the authors are associated with the Center for Global Change and Earth Observations (CGCEO) of Michigan State University.
The authors declare that they have no competing interests.
Authors and Affiliations
Center for Global Change and Earth Observations (CGCEO)/Department of Geography, Environment, and Spatial Sciences, Michigan State University, East Lansing, MI, 48824, USA
Yi Fan, Jiquan Chen, Gabriela Shirkey, Ranjeet John, Susie R. Wu, Hogeun Park & Changliang Shao
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.