Determining the Number of Degrees of Freedom

(Updated September 26, 2018)

Degrees of freedom in SEM reflect the complexity vs. parsimony of a model. We start off with the number of known pieces of information (elements) regarding the manifest (or measured) indicator variables included in your model. The number of known elements is simply the number of variances and covariances (unstandardized correlations) among the measured variables (some applications also count the means of the measured variables). Known elements are shown in the yellow and light-blue matrix below. From the number of known elements, we subtract the number of freely estimated parameters (i.e., paths, correlations, variances) in the structural equation model. Fixed parameters (e.g., a factor loading set to 1) are not counted in determining degrees of freedom.

We can use the analogy of a bank account. The known elements are like the dollars in your account. For each parameter you freely estimate, however, you have to "withdraw" a dollar. The degrees of freedom (df) are thus the number of dollars still in your account after you have withdrawn all the dollars needed to pay for the freely estimated parameters. Accordingly, the greater the number of paths (or other parameters) you estimate, the lower the df. In a saturated (or just-identified) model, in which all possible pathways that could be estimated are estimated, the df will be zero. (A saturated model will fit the data perfectly, not due to any accomplishment, but simply as a mathematical property.) As we will see, some measures of model fit (e.g., Comparative Fit Index, Tucker-Lewis Index) take the model's df into account.
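As a minimal sketch of the bank-account arithmetic, the short Python function below subtracts the "withdrawals" from the "balance." The function name and the example numbers are hypothetical, chosen only for illustration. Treating a negative balance as an error is an added assumption: it reflects the fact that a model cannot estimate more free parameters than there are known elements.

```python
def degrees_of_freedom(known_elements: int, free_parameters: int) -> int:
    """Dollars left in the account after paying for every free parameter."""
    df = known_elements - free_parameters
    if df < 0:
        # Assumption for this sketch: an "overdrawn" account is treated as an error.
        raise ValueError("more free parameters than known elements")
    return df


# Hypothetical model: 10 known elements, 8 freely estimated parameters.
print(degrees_of_freedom(10, 8))   # 2

# Saturated (just-identified) model: every dollar has been spent.
print(degrees_of_freedom(10, 10))  # 0
```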


The number of known elements in your input variance/covariance matrix (not including means) can be determined from the following equation, where "I" is the number of manifest indicators:

Known elements = I(I + 1) / 2
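For readers who like to check the count by computer, here is a small sketch in Python. With I indicators there are I variances and I(I - 1)/2 covariances, which together make I(I + 1)/2 known elements. The function name and the include_means option are hypothetical; the option simply adds one mean per indicator, for applications that count means.

```python
def known_elements(n_indicators: int, include_means: bool = False) -> int:
    """Count the variances and covariances among the manifest indicators."""
    # I variances plus I(I - 1)/2 covariances equals I(I + 1)/2.
    count = n_indicators * (n_indicators + 1) // 2
    if include_means:
        # Some applications also count one mean per measured variable.
        count += n_indicators
    return count


print(known_elements(24))  # 300
```

With 24 indicators, as in the Love Attitudes Scale short form, the function returns 300.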


Also, in your SEM printouts, you can find the number of freely estimated parameters by counting the parameters that have significance tests (i.e., estimates, critical ratios, and p levels). A fixed parameter will not have a significance test.

***

In counting up the number of freely estimated parameters in a model, the distinctions between construct residual variances and construct variances, and between indicator residual variances and indicator variances, can be confusing. Here's a little more explanation.

Recall that anything (construct or indicator) that has an incoming unidirectional arrow from something else in the model gets a residual variance ("bubble" or "headphone"). In the first example in the following photo, a latent construct (large oval) has an incoming direct arrow (shown at left) from some hypothetical predictor variable. Let's say the predictor accounts for 45% of the variance in the shown construct (like an R-squared in regression). The residual (unaccounted for) variance in the bubble would thus be 55%. Because the variance accounted for (R-squared) and unaccounted for (residual) in a dependent measure must sum to 100%, the R-squared and residual variances are redundant. If you know one is 45%, the other must be 55%, and vice-versa. There is thus no need to include both variances in the model. By SEM convention, the variance in such a situation is "housed" in the residual bubble (indicated by an asterisk * in the photo), which is called a "construct residual variance."


Similar reasoning holds in the third pictured scenario. Each manifest indicator (rectangle) has variance accounted for by the construct, as well as residual variance. Again, each indicator's variance in this scenario is housed in the residual bubble (see asterisks), and is known as an "indicator residual variance."

Either a construct (second example) or stand-alone indicator variable (fourth example) may have no incoming unidirectional arrows, and only outgoing unidirectional arrows. In these situations, lacking a residual bubble, the variance is housed in either the construct or indicator itself.
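The rule for where a variance is "housed" can be sketched in a few lines of Python. The variable names and the arrow layout below are hypothetical, not taken from the photo. The sketch only records whether each variable has an incoming unidirectional arrow; it says nothing about whether the variance is freely estimated or fixed.

```python
# Hypothetical model: each variable maps to the list of things pointing at it.
incoming_arrows = {
    "predictor": [],             # no incoming arrows
    "construct": ["predictor"],  # predicted by something else
    "item1": ["construct"],      # indicator of the construct
    "item2": ["construct"],
}


def variance_home(variable: str, arrows: dict) -> str:
    """Return 'residual' if the variable has an incoming arrow, else 'own'."""
    return "residual" if arrows[variable] else "own"


homes = {name: variance_home(name, incoming_arrows) for name in incoming_arrows}
print(homes)
```

In this sketch each variable contributes exactly one variance, housed either in its residual bubble or in the variable itself, never both.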

***

For our CFA assignment on the Hendrick and Hendrick Love Attitudes Scale (Love Styles), the photo below shows how we would determine the df. There are two clarifications before we look at the photo:

  • The known elements appear in the red matrix on the right-hand side of the photo. With 24 measured variables (six love-style constructs with four items each in the short form of the Love Attitudes Scale), there will be 24 variances and 276 covariances (24 × 23 / 2), yielding 300 known elements or "dollars."
  • The freely estimated parameters appear on the left-hand side of the photo. The way we would typically run this CFA in AMOS and Mplus, there would be 18 freely estimated factor loadings (one loading per factor being fixed at 1) and 6 freely estimated construct variances. Alternatively, all 24 factor loadings could be freely estimated and all 6 construct variances fixed to 1. Either way, the factor loadings and construct variances together would add up to 24 freely estimated parameters. The total also includes the 15 correlations among the six constructs (or factors) and the 24 indicator residual variances (tiny bubbles or headphones). Ultimately, the model has 24 + 15 + 24 = 63 freely estimated parameters (or dollar expenditures). The df are 300 - 63 = 237.
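The tally for the Love Styles CFA can be reproduced with a short Python sketch. The variable names are hypothetical. The breakdown assumes that every pair of the six constructs is allowed to correlate and that each indicator has one freely estimated residual variance, which is what reproduces the total of 63 given above.

```python
n_constructs = 6
items_per_construct = 4
n_indicators = n_constructs * items_per_construct           # 24 measured variables

# Deposits: variances and covariances among the indicators.
known = n_indicators * (n_indicators + 1) // 2              # 300

# Withdrawals: freely estimated parameters.
free_loadings = n_indicators - n_constructs                 # 18 (one loading per factor fixed at 1)
construct_variances = n_constructs                          # 6
construct_correlations = n_constructs * (n_constructs - 1) // 2  # 15 (assumes all pairs correlate)
indicator_residual_variances = n_indicators                 # 24 (assumes one per indicator)

free = (free_loadings + construct_variances
        + construct_correlations + indicator_residual_variances)

df = known - free
print(known, free, df)  # 300 63 237
```

Switching to the alternative setup (all 24 loadings free, all 6 construct variances fixed to 1) leaves the total unchanged, because the loadings and construct variances still sum to 24.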