Determining the Number of Degrees of Freedom

(Updated September 26, 2018)

Degrees of freedom in SEM reflect the complexity vs. parsimony of a model. We start off with the number of known pieces of information (elements) regarding the manifest (or measured) indicator variables included in your model. The number of known elements is simply the number of variances and covariances (unstandardized correlations) among the measured variables (some applications also count the means of the measured variables). Known elements are shown in the yellow and light-blue matrix below. From the number of known elements, we subtract the number of freely estimated parameters (i.e., paths, correlations, variances) in the structural equation model. Fixed parameters (e.g., a factor loading set to 1) are not counted in determining degrees of freedom.

We can use the analogy of a bank account. The known elements are like the dollars in your account. For each freely estimated parameter, however, you have to "withdraw" a dollar. The degrees of freedom are thus the number of dollars still in your account after you have withdrawn all the dollars needed for the freely estimated parameters. Accordingly, the greater the number of paths you estimate, the lower the df. In a saturated (or just-identified) model, in which all possible pathways that could be estimated are estimated, the df will be zero. (A saturated model will fit the data perfectly, not due to any accomplishment, but simply as a mathematical property.) As we will see, some measures of model fit (e.g., Comparative Fit Index, Tucker-Lewis Index) take the model's df into account.


The number of known elements in your input variance/covariance matrix (not including means) can be determined from the following equation, where "I" is the number of manifest indicators:

Known elements = I(I + 1) / 2

Also, in your SEM printouts, the number of freely estimated parameters can be determined by counting how many parameters have significance tests (i.e., estimates, critical ratios, and p levels). A fixed parameter will not have a significance test.
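
The count of known elements can also be sketched in a few lines of Python. This is only a minimal illustration, assuming the usual tally of one variance per indicator plus one covariance per pair of indicators. The function name known_elements and the include_means flag are hypothetical, introduced just for this example.

```python
def known_elements(n_indicators: int, include_means: bool = False) -> int:
    """Count the known elements for a set of manifest indicators."""
    # One variance per measured variable.
    variances = n_indicators
    # One covariance per distinct pair of measured variables.
    covariances = n_indicators * (n_indicators - 1) // 2
    total = variances + covariances
    # Some applications also count the mean of each measured variable.
    if include_means:
        total += n_indicators
    return total


for i in (3, 4, 24):
    print(i, "indicators:", known_elements(i), "known elements")
```

Passing include_means=True adds one known element per indicator, for applications that also count the means.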

***

In counting up the number of freely estimated parameters in a model, the distinctions between construct residual variances and construct variances, and between indicator residual variances and indicator variances, can be confusing. Here's a little more explanation.

Recall that anything (construct or indicator) that has an incoming unidirectional arrow from something else in the model gets a residual variance ("bubble" or "headphone"). In the first example in the following photo, a latent construct (large oval) has an incoming direct arrow (shown at left) from some hypothetical predictor variable. Let's say the predictor accounts for 45% of the variance in the shown construct (like an R-squared in regression). The residual (unaccounted for) variance in the bubble would thus be 55%. Because the variance accounted for (R-squared) and unaccounted for (residual) in a dependent measure must sum to 100%, the R-squared and residual variances are redundant. If you know one is 45%, the other must be 55%, and vice-versa. There is thus no need to include both variances in the model. By SEM convention, the variance in such a situation is "housed" in the residual bubble (indicated by an asterisk * in the photo), which is called a "construct residual variance."


Similar reasoning holds in the third pictured scenario. Each manifest indicator (rectangle) has variance accounted for by the construct, as well as residual variance. Again, each indicator's variance in this scenario is housed in the residual bubble (see asterisks), and is known as an "indicator residual variance."

Either a construct (second example) or stand-alone indicator variable (fourth example) may have no incoming unidirectional arrows, and only outgoing unidirectional arrows. In these situations, lacking a residual bubble, the variance is housed in either the construct or indicator itself.

***

For our CFA assignment on the Hendrick and Hendrick Love Attitudes Scale (Love Styles), the photo below shows how we would determine the df. There are two clarifications before we look at the photo:

  • The known elements appear in the red matrix on the right-hand side of the photo. With 24 measured variables (six love-style constructs with four items each in the short-form of the Love Attitudes Scale), there will be 24 variances and 276 covariances (correlations), yielding 300 known elements or "dollars."
  • The freely estimated parameters appear on the left-hand side of the photo. The way we would typically run this CFA in AMOS and Mplus, there would be 18 freely estimated factor loadings (one loading per factor being fixed at 1) and 6 freely estimated construct variances. Alternatively, all 24 factor loadings could be freely estimated and all 6 construct variances fixed to 1. Either way, the factor loadings and construct variances together contribute 24 freely estimated parameters. The total number of freely estimated parameters also includes the 15 correlations among the six constructs (or factors) and the 24 indicator residual variances (tiny bubbles or headphones). Ultimately, the model has 63 freely estimated parameters (or dollar expenditures). The df are 300 - 63 = 237.
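
The same bookkeeping can be written out as a short Python tally. This is a sketch rather than output from AMOS or Mplus. It assumes the standard CFA setup in which all six constructs are allowed to correlate with one another and each indicator has its own residual variance, which is what makes the total come to 63. The variable names are made up for this example.

```python
n_indicators = 24  # four items for each of six love styles
n_factors = 6

# "Dollars in the account": variances plus covariances.
known = n_indicators * (n_indicators + 1) // 2

# "Withdrawals": one entry per kind of freely estimated parameter.
free_parameters = {
    "factor loadings (one per factor fixed at 1)": n_indicators - n_factors,
    "construct variances": n_factors,
    "correlations among constructs": n_factors * (n_factors - 1) // 2,
    "indicator residual variances": n_indicators,
}

n_free = sum(free_parameters.values())
df = known - n_free

for label, count in free_parameters.items():
    print(f"{label}: {count}")
print("known elements:", known)
print("freely estimated parameters:", n_free)
print("degrees of freedom:", df)
```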

SEM The Musical 1

The debut of "SEM The Musical" was held on April 27 and we ended up with 19 songs (lyrics below). Some video clips are also available below, shown by their respective songs and lyrics (thanks to Sothy Eng and Xiaozhi "Gigi" Zhou for their videography work). Derek Ross, an obviously talented video editor and husband of one of the SEM students, condensed our musical into a five-minute documentary, which is more like a spoof infomercial. Click here for the documentary/spoof infomercial. It's truly "must-see TV." Just to be clear: We are not selling videos!

SEM The Musical
By Dr. Alan Reifman and his Spring 2007 Quantitative Methods IV class

(Back-up vocals in parentheses)

Welcome to SEM The Musical
Lyrics by Alan Reifman
(May be sung to the tune of "Matchmaker," Bock/Harnick, from Fiddler on the Roof)

SEM, SEM, it can be sung,
You’ll be amazed, at what we’ve sprung,
We hope you’ll learn more ’bout this stats technique,
Through songs of which you’re among,

SEM, SEM, we like to run,
It takes awhile, but we get it done,
We hope you’ll learn of the steps that we take,
And take home from this, some fun…

I Am an Indicator
Lyrics by Alan Reifman
(May be sung to the tune of "The Entertainer," Billy Joel)

I am an indicator, a latent construct I represent,
I'm measurable, sometimes pleasurable,
A manifestation of what is meant,

I am an indicator, I usually come in a multiple set,
With other signs of the same construct, you may instruct,
I'm correlated with my co-indicators, you can bet,

I am an indicator, from my presence the construct is inferred,
I'm tap-able, the construct is not palpable,
The distinction should not be blurred

At Least Three
Lyrics by Alan Reifman
(May be sung to the tune of "Think of Me," Lloyd Webber/Hart/Stilgoe, from Phantom of the Opera)

(Cat Pause, lead vocals)

At least three, indicators are urged,
For each latent construct shown,
At least three, indicators should help,
Avoid output where you groan,

With less than three, your construct sure will be, locally unidentified,
Though the model might still run, you could have a rough ride

Gotta Fix It to 1
Lyrics by Alan Reifman
(May be sung to the tune of "Fortunate Son," John Fogerty)

You make a construct, with its loadings,
Can’t let them, all be free,
So that the model’s identified,
Fixing one is the key,

It ain’t free,
It ain’t free,
Gotta fix it to 1,

It ain’t free,
It ain’t free,
In AMOS, automatically done

The number of knowns in your model,
The unknowns can’t exceed,
Fixing a loading for each construct,
Will accomplish this need,

It ain’t free,
It ain’t free,
Gotta fix it to 1,

It ain’t free,
It ain’t free,
In AMOS, automatically done

Residual Variance
Lyrics by Alan Reifman
(May be sung to the tune of "I Say a Little Prayer," Bacharach/David)

Residual variance,
What variables do not share, hence,
I draw a little shape for you,

Residual variance,
What’s left after the R-square, hence,
I draw a little shape for you,

Small circles, to show the, unexplained variance,
...we will always use,
We’ll see what is left in the indicators,
And endogenous,
Constructs that we predict to...

Constrain, ’strain, ’strain...
Lyrics by Alan Reifman
(May be sung to the tune of "Chain of Fools," Don Covay, popularized by Aretha Franklin)

(Cat Pause, lead vocals)

Constrain, ’strain, ’strain (Constrain, ’strain, ’strain),
Constraints are tools (Constraints are tools),
Constrain, ’strain, ’strain (Constrain, ’strain, ’strain),
Constraints are tools (Constraints are tools),

You want to test, if two paths are equal,
You run the model once, then you run a sequel,

First, let the paths run free, they take on their own values,
A chi-square you will see, but what does it tell you?

You must...

Constrain, ’strain, ’strain (Constrain, ’strain, ’strain),
Constraints are tools (Constraints are tools),
Constrain, ’strain, ’strain (Constrain, ’strain, ’strain),
Constraints are tools (Constraints are tools),

You re-run your model, with paths fixed to be the same,
You get a new chi-square, higher than what before came,

You compare the two models, via the delta chi-square test,
If it’s significant, then the free version is best,

When you’ve...

Constrained, ’strained, ’strained (Constrained, ’strained, ’strained),
Constraints are tools (Constraints are tools),
Constrained, ’strained, ’strained (Constrained, ’strained, ’strained),
Constraints are tools (Constraints are tools)...

If it's Nested
Lyrics by Alan Reifman and Adam Munk
(May be sung to the tune of "Mandy," English/Kerr, popularized by Barry Manilow)

If you want to check and see,
If a path is necessary,
What you should do,
Is run a nested model,
Here's the steps to take,
You don't want to dawdle...

If it's nested,
You must only add paths without taking,
Or only take away paths without adding,

If it's nested,
You can compare chi-squares of the models,
And you'll see if the new path is worth adding...

Parsi-Mony
Lyrics by Alan Reifman (expanded for 2010, video)
(May be sung to the tune of "Mony Mony," Bloom/Gentry/James/Cordell)

NOTE: Parsi-Mony has come to be performed as our closing number every year. Below, Dr. Reifman chats with Tommy James, whose classic hit Mony Mony inspired Parsi-Mony. James performed at the 2013 South Plains Fair and was kind enough to stick around and visit with fans and sign autographs. Dr. Reifman tells Tommy about how he (Dr. Reifman) has written statistical lyrics to Tommy's songs for teaching purposes.  




Structural models need parsimony,
Don’t want to add paths that are phony,
Put the paths you need, now that’s all right, yeah,
You got to keep your model lean and tight, now,
...lean and tight now,
I said, yeah (audience joins), yeah, yeah, yeah, yeah,…

If you can account (PARSIMONY),
For (PARSIMONY),
The data (PARSIMONY),
With a (PARSIMONY),
Minimum of paths (PARSIMONY),
You’ve got (PARSIMONY)
Baby don't stop, seeking (PARSIMONY),
Hey, yeah, yeah, yeah, yeah, yeah, yeah,

Get up!
(brief break)

Few paths, sparse graphs, parsimony,
Above all, keep it small, parsimony,
You want to keep your model looking slim, yeah,
Don't stop now, seek out parsimony, seek parsimony!

Yeah, yeah, yeah…

If you can account (PARSIMONY),
For (PARSIMONY),
The data (PARSIMONY),
With a (PARSIMONY),
Minimum of paths (PARSIMONY),
You’ve got (PARSIMONY)
Baby don't stop, seeking (PARSIMONY),
Hey, yeah, yeah, yeah, yeah, yeah, yeah,

[interlude -- introduce our "band," thank-you's, etc.]

You want parsimony ...mo ...mo ...mony (audience repeats)
Parsimony ...mo ...mo ...mony (audience repeats)
Parsimony ...mo ...mo ...mony...(audience repeats)

Covariance
Lyrics by Alan Reifman
(May be sung to the tune of "Aquarius," Rado/Ragni/MacDermot, from Hair, also popularized by the Fifth Dimension)

You draw paths to show relationships,
You hope align with the known r’s,
Your model will guide the tracings,
From constructs near to constructs far,

You will compare this with the data’s covariance,
The data’s covariance...
Covariance!
Covariance!

Similar to correlation,
With the variables unstandardized,
Does each known covariance match up with,
The one the model tracings will derive?

Covariance!
Covariance!

You’ve Got to Check Your R-M-S-E-A
Lyrics by Alan Reifman
(May be sung to the tune of "YMCA," Belolo/Morali/Willis, popularized by the Village People)

How well, does your model match up,
To the matrix of covariances? Yup,

I said, How well, can you reproduce the,
Structure... of the... variables... you see?

You’ve got to check your R-M-S-E-A,
You’ve got to check your R-M-S-E-A,
You want your value, to be very small,
Preferably below, .05 will it fall,

You’ve got to check your R-M-S-E-A,
You’ve got to check your R-M-S-E-A,
It’s one of the, best fit indices,
You can check it, with any others you please...

Check Your NFI
Lyrics by Alan Reifman
(May be sung to the tune of "Judy’s Turn to Cry," Lewis/Ross, popularized by Lesley Gore)

You’ve got to check your NFI,
...check your TLI,
...check your CFI,
’Cause none of them alone’s a hit...

You’ve just finished running your model,
And you want to know its goodness of fit,
But there’s no one single index,
That scholars consider a perfect hit,

You’ve got to check your NFI,
...check your TLI,
...check your CFI,
’Cause none of them alone’s a hit...

The standard advice is to look at,
A variety of measures of fit,
So you pick out a set of several,
Of indices, you form your own kit,

You’ve got to check your NFI,
...check your TLI,
...check your CFI,
’Cause none of them alone’s a hit...

(Instrumental)

Stand by Your Model
Lyrics by Alan Reifman
(May be sung to the tune of "Stand by Your Man," Wynette/Sherrill)

When there’s a path,
That comes out non-significant,
What should you do?
Should you eliminate this path?

Stand by your model,
It represents your best thinking,
Stand by your model,
Don’t want one that’s shrinking,

Just because you,
Didn’t find a certain result,
Keep the model intact,
A future study may support it,

Stand by your model,
It represents your best thinking,
Stand by your model,
Don’t want one that’s shrinking,

Ready to Run
Lyrics by Alan Reifman
(May be sung to the tune of "Ready to Run," Seidel/Hummon, popularized by the Dixie Chicks)

I’ve drawn my shapes and my arrows,
I’m gonna be ready this time (ready this time),
I’ve requested a standardized solution,
I’m gonna be ready this time (ready this time),

Ready, ready, ready, ready, ready, ready to run,
Error messages, I hope to see none...,
Will my assignment get done?

Gonna view my text output,
I hope that it’s correct this time (correct this time),
Got a chi-square and acceptable values,
It looks like it’s correct this time (correct this time),

Ready, ready, ready, ready, ready, ready to run,
Looking for error messages, I see none,
Running a model is fun...

If I Had Multiple Groups
Lyrics by Alan Reifman
(May be sung to the tune of "If I Had a Hammer," Hays/Seeger)

If I had multiple groups,
I’d run them in the morning,
I’d run them in the evening,
All over this land,

I’d use cross-group constraints,
On the loadings and the structural paths,
I’d want to see if, the chi-square was so different,
From an unconstrained hand

You Had a Bad Fit
Lyrics by Lukas Dean
(May be sung to the tune of "Bad Day," Daniel Powter)

Where is the department statistician when needed the most?
Your research and theory kick up a model that's great,
You draw it in AMOS in a way that relates,
The little wand helps make the structural paths straight,
You link it to data, now you're out of the gate,

You stand up in the lab to see how the results go,
You fake up a smile when you forgot to estimate means, oh no!

The output says your model is way off line,
Your theory's falling to pieces this time,
And I don't have no Heywood Case,

You had a bad fit,
You take some items down,
You correlate errors, just to turn it around,

You say you don't know, the numbers don't lie,
Your RMSEA was way too high,
You had a bad fit,
The numbers don't lie,
Kenny says it shouldn't be higher than .05,
You had a bad fit, you had a bad fit

An SEM Miracle
Lyrics by Alan Reifman (expanded for 2010)
(May be sung to the tune of "It’s a Miracle," Manilow/Panzer)

I ran this model hours on end,
And kept having problems with it,
There was negative variance,
Known as a Heywood Case,

Paths were unidentified,
Despite everything that I tried,
I did what, I thought would work,
The problem, I couldn't trace,

It’s a miracle (miracle),
All errors have gone away,
The model finally runs,

It was looking hazy, I was going crazy,
Till the output page came through,
It looks clear, and my fit will astound you,
So maybe, I no longer face defeat,

For the miracle (miracle),
I can start writing now,
My homework’s almost done,

I’m finally starting, now, to see,
Where I may have, really, gone astray,
I may have been missing,
A "1" for a fixed pathway,

You've got to use, the AMOS tool,
Or you're gonna look like a fool,
But now that I've done it right,
The errors just go away,

It’s a miracle (miracle),
All errors have gone away,
The model finally runs,

It was looking hazy, I was going crazy,
Till the output page came through,
It looks clear, and my fit will astound you,
And baby, I'll be dancing in the street!

Your Model’s Only One
Lyrics by Alan Reifman
(May be sung to the tune of "The Old Man Down the Road," John Fogerty)

You need a good conceptual model,
You need a nice, large sample size,
You need multiple indicators for,
Each latent construct you surmise,

Plus, you must realize,
That your model’s only one,
Of the many equal-fitting,
Models... that could be run,

Your model represents a best guess,
Causality you cannot show,
You may get some good ideas,
For an experimental way to go,

Thus, you must realize,
That your model’s only one,
You should probably look at,
The writings... of MacCallum

I Guess It Never Hurts to Winsorize
Lyrics by Kristina Keyton
(May be sung to the tune of "I Guess It Never Hurts to Hurt Sometimes," Randy VanWarmer, popularized by the Oak Ridge Boys)

Sometimes I feel the weight,
Of an outlier in my model,
It caused a Heywood case,
And it makes me want to cry,
Is there nothing we can do,
To fix this data problem,
But a memory,
Of Reifman's class saved me,

Outliers always hurt the mean,
And that's ruining my model,
But I won't give up on it,
Just because of one number,
Sometimes it makes me sad,
That we can't just say goodbye,
But I guess it never hurts to Winsorize,

We try and hold on to our moments,
But outliers can't stay,
But we can't just delete,
We lose information that way,
We can't look forward to our output,
And still hold onto bad data,
Oh I hope that you will hear me,
When I say...

Outliers always hurt the mean,
And that's ruining my model,
But I won't give up on it,
Just because of one number,
Sometimes it makes me sad,
That we can't just say goodbye,
But I guess it never hurts to Winsorize

SEM, Oh, SEM
Lyrics by Alan Reifman, dedicated to Peter Westfall (article of his)
(May be sung to the tune of "Galveston," Jimmy Webb, popularized by Glen Campbell)

Ultimately, SEM,
Your LV’s cannot be measured,
Which gives the critics some displeasure,
There’s nothing physical to grab on,
When you run SEM,

SEM, Oh, SEM,
You make many an assumption,
Is it recklessness or gumption?
Assume the e’s uncorrelated...
When you run SEM,

I can see the critics’ point of view, now,
They’re saying the models aren’t unique,

That, we must willingly acknowledge,
In response to the critique, if we want to keep on using...

SEM, Oh, SEM...

Longitudinal/Panel SEM

(Updated April 8, 2015)

Our next topic is longitudinal SEM, specifically a particular type of longitudinal design called a panel study, in which the same respondents are followed over time (longitudinal panel studies should not be confused with online/consumer panels). An example of a longitudinal panel study is the University of Michigan's Panel Study of Income Dynamics. Within the longitudinal panel design, we will learn about autoregressive paths (a variable predicting itself at a later wave) and cross-lagged paths (one variable predicting a different variable at a later wave). Equality constraints will play a major role here.
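
To make the two kinds of paths concrete, here is a minimal sketch in Python with NumPy. It simulates two-wave data for two measured variables and recovers the paths with ordinary least squares regressions on observed scores, not with a full SEM using latent constructs, correlated residuals, or equality constraints. All variable names and path values are hypothetical, chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000  # simulated respondents, each measured at two waves

# Wave 1 scores on two variables that are modestly related.
x1 = rng.normal(size=n)
y1 = 0.3 * x1 + rng.normal(size=n)

# Hypothetical "true" paths used to generate Wave 2.
AUTO_X, AUTO_Y = 0.6, 0.5  # autoregressive: X1 -> X2, Y1 -> Y2
CROSS_X_TO_Y = 0.2         # cross-lagged: X1 -> Y2
CROSS_Y_TO_X = 0.0         # cross-lagged: Y1 -> X2 (absent here)

x2 = AUTO_X * x1 + CROSS_Y_TO_X * y1 + rng.normal(size=n)
y2 = AUTO_Y * y1 + CROSS_X_TO_Y * x1 + rng.normal(size=n)


def ols_slopes(outcome, *predictors):
    """Return least-squares slopes (intercept dropped) for the predictors."""
    design = np.column_stack([np.ones(len(outcome)), *predictors])
    coefs, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coefs[1:]


# Each Wave 2 variable is regressed on both Wave 1 variables.
x2_auto, x2_cross = ols_slopes(x2, x1, y1)
y2_auto, y2_cross = ols_slopes(y2, y1, x1)

print("X1 -> X2 (autoregressive):", round(x2_auto, 2))
print("Y1 -> X2 (cross-lagged):  ", round(x2_cross, 2))
print("Y1 -> Y2 (autoregressive):", round(y2_auto, 2))
print("X1 -> Y2 (cross-lagged):  ", round(y2_cross, 2))
```

Because each Wave 2 score is predicted from both Wave 1 scores, a cross-lagged path reflects prediction over and above the outcome's own earlier level.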

One of the major purposes of longitudinal panel studies is to get a good approximation of causality. Short of actual experimentation, a longitudinal panel study is probably as good a design as there is for inferring causation. A couple of lecture modules from my methods course may be helpful, along with a 2009 article from Child Development.

The following article by Albert Farrell should also be helpful. We will go over sections of it in class.

Farrell, A.D. (1994). Structural equation modeling with longitudinal data: Strategies for examining group differences and reciprocal relationships. Journal of Consulting and Clinical Psychology, 62, 477-487.

The article actually covers both longitudinal-panel models and multiple-group models. The two are separate topics; a study can have one of these aspects and not the other. We'll also use the Farrell article to discuss multiple-group modeling, but later on.

This PowerPoint slideshow by Patrick Sturgis is also helpful.

UPDATE (April 13, 2011): I've made some new graphics to illustrate two modeling conventions associated with panel SEM.



The correlated residuals are sometimes known as the "fountain effect" for their visual appearance. The fountain at Las Vegas's Bellagio Hotel nicely illustrates this, as seen below (from GoVegas.about.com).


UPDATE (March 18, 2010): Cameron McIntosh sent a list of references on longitudinal/panel analysis to the SEMNET listserv discussion group. The list, which I've lightly edited, may be helpful for students seeking to pursue the topic in greater detail.

Little, T.D., Preacher, K.J., Selig, J.P., & Card, N.A. (2007). New developments in latent variable panel analyses of longitudinal data. International Journal of Behavioral Development, 31, 357-365. [Copy available on Dr. Preacher's publications page; see heading "Longitudinal factorial invariance."]

Collins, L.M. (2006). Analysis of longitudinal data: The integration of theoretical model, temporal design, and statistical model. Annual Review of Psychology, 57, 505-528.

Phillips, J.A., & Greenberg, D.F. (2007). A comparison of methods for analyzing criminological panel data. Journal of Quantitative Criminology, 24, 51-72.

Preacher, K.J., Wichman, A.L., MacCallum, R.C., & Briggs, N.E. (2008). Latent growth curve modeling (part of the series Quantitative Applications in the Social Sciences, vol. 157). Thousand Oaks, CA: Sage.

Bollen, K.A., & Brand, J.E. (2008). Fixed and random effects in panel data using structural equation models. Los Angeles, CA: California Center for Population Research, UCLA (online).

Wu, A.D., Liu, Y., Gadermann, A.M., & Zumbo, B.D. (2009). Multiple-indicator multilevel growth model: A solution to multiple methodological challenges in longitudinal studies. Social Indicators Research (published online).

More advanced:

Curran, P.J., & Bollen, K.A. (2001). The best of both worlds: Combining autoregressive and latent curve models. In L.M. Collins & A.G. Sayer (Eds.), New methods for the analysis of change (pp. 105-136). Washington, DC: American Psychological Association.

Bollen, K.A., & Curran, P.J. (2004). Autoregressive latent trajectory (ALT) models: A synthesis of two traditions. Sociological Methods and Research, 32, 336-383.

Delsing, M.J.M.H., & Oud, J.H.L. (2008). Analyzing reciprocal relationships by means of the continuous-time autoregressive latent trajectory model. Statistica Neerlandica, 62, 58-82.

Oud, J.H.L. (2002). Continuous time modeling of the cross-lagged panel design. Kwantitatieve Methoden, 69, 1-26.

Hamaker, E.L. (2005). Conditions for the equivalence of the autoregressive latent trajectory model and a latent growth curve model with autoregressive disturbances. Sociological Methods and Research, 33, 404-416.

Voelkle, M.C. (2008). Reconsidering the use of autoregressive latent trajectory (ALT) models. Multivariate Behavioral Research, 43, 564-591.