Comparative Model Testing and Nested Models

As we've discussed, part of Assignment 2 requires you to engage in comparative model testing. Specifically, you will run your model both with and without directed paths from three university properties (public/private status, years of existence, and endowment [square-root transformed]) to Undergraduate Quality (UQ).

The more parsimonious model is, of course, the one without the additional paths. To override the preference for parsimony, therefore, you will have to show that the additional paths, as a set, significantly reduce the overall model chi-square, thus improving model fit. As you move along in your careers, you may wish to adopt additional criteria, such as whether the reduction in chi-square appears substantively large in addition to being statistically significant, but for now, we'll use statistically significant change as our criterion.

You can display your results in a table, as follows:

--------------------------------------------------------------
Model                              χ²              df
--------------------------------------------------------------
Model w/ fewer parameters          ----            ---
Model w/ added parameters          ----            ---
--------------------------------------------------------------
Delta (change)                     ----            ---
--------------------------------------------------------------

The chi-square change score (the top chi-square minus the bottom one) can be treated like any other chi-square value and be referred to a chi-square table, with degrees of freedom equal to delta df (top df minus bottom df).

UPDATE, March 7, 2017: For the version of the universities model without the three paths listed above, the chi-square is 210.98 (31 df), whereas for the model that adds the three paths, chi-square = 169.04 (28 df).
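
The change here is 210.98 - 169.04 = 41.94 on 31 - 28 = 3 df, which far exceeds the .05 critical value of 7.81 for 3 df, indicating a statistically significant improvement in fit. Below is a minimal sketch of the same computation in Python using scipy; the code is only an illustration and is not part of the assignment.

    from scipy.stats import chi2

    # Chi-square and df for the two nested universities models (from the update above)
    chisq_fewer, df_fewer = 210.98, 31   # model without the three added paths
    chisq_added, df_added = 169.04, 28   # model with the three added paths

    # The difference in chi-squares is itself chi-square distributed,
    # with df equal to the difference in model df
    delta_chisq = chisq_fewer - chisq_added   # 41.94
    delta_df = df_fewer - df_added            # 3

    p_value = chi2.sf(delta_chisq, delta_df)  # upper-tail probability
    critical = chi2.ppf(0.95, delta_df)       # .05 critical value (about 7.81)

    print(f"Delta chi-square = {delta_chisq:.2f} on {delta_df} df")
    print(f"Critical value = {critical:.2f}, p = {p_value:.2g}")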

UPDATE, March 11, 2012: Xiaohui photographed the explanation I diagrammed on the board, linking number of paths in a model, goodness of fit, chi-square, and degrees of freedom. A key point was to demonstrate that if one model has a higher chi-square than another model, it will also have a higher number of degrees of freedom. All of the green phrases go together: a model with fewer paths (which preserves a higher df) will have a poorer fit and thus a higher chi-square. The red terms represent the opposite of the green terms, and thus the red terms go together, as well: a model with more paths (which depletes the df) will lead to a better fit and thus a lower chi-square.

UPDATE, March 5, 2008: Kristina photographed the decision-tree I drew on the board, to augment our discussion of comparative model testing. Here it is (you can click on the image to enlarge it).


And now, back to our regular programming...

An important condition for conducting comparative model tests is that the two models being compared must possess the property of nestedness. Two models are nested if one can be obtained from the other solely by adding parameters (or, equivalently, solely by removing them). By parameters, we mean anything that is freely estimated in SEM (e.g., structural paths, non-directional correlations). If you start with one model and convert it to a new, second model by both adding and subtracting parameters, the two models will not fulfill the criterion for nestedness and thus cannot be compared via the delta chi-square test.
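
In other words, nestedness amounts to a subset relation on the two models' sets of freely estimated parameters. Here is a minimal sketch of that idea in Python; the parameter labels are hypothetical and not taken from the universities model.

    def is_nested(free_params_a, free_params_b):
        """Two models are nested if one model's free parameters are a subset of
        the other's, i.e., you can get from one model to the other only by
        adding parameters (or only by removing them)."""
        a, b = set(free_params_a), set(free_params_b)
        return a <= b or b <= a

    # Hypothetical parameter labels, purely for illustration
    model_1 = {"X1->Y", "X2->Y"}
    model_2 = {"X1->Y", "X2->Y", "X3->Y"}   # model_1 plus one added path: nested
    model_3 = {"X1->Y", "X3->Y"}            # drops X2->Y and adds X3->Y, relative to model_1

    print(is_nested(model_1, model_2))  # True:  delta chi-square comparison is legitimate
    print(is_nested(model_1, model_3))  # False: neither is a subset of the other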

The following two diagrams provide examples of nested and non-nested models.

An analogous situation exists in multiple regression. You can do a delta R-square test to see, for example, whether a model with predictor set A, B, C, D, and E accounts for significantly more variance in the dependent variable than does predictor set A, B, and C. ABC is contained (that is, nested) within ABCDE, thus permitting the statistical comparison. You could not, however, test whether predictor set ABCDF accounts for more variance than set ABCDE, because the change in models would require both dropping a predictor and adding one: if ABCDE were the starting point, we would have dropped E and added F.
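
Here is a minimal sketch of that delta R-square (nested-model) F test in Python, using simulated data and statsmodels; the data, variable labels, and library choice are my own illustration, not course materials.

    import numpy as np
    import statsmodels.api as sm

    # Simulated data: five hypothetical predictors (A-E) and an outcome y
    rng = np.random.default_rng(0)
    n = 200
    X = rng.normal(size=(n, 5))                      # columns correspond to A, B, C, D, E
    y = X[:, :3] @ [0.5, 0.3, 0.2] + 0.4 * X[:, 3] + rng.normal(size=n)

    full = sm.OLS(y, sm.add_constant(X)).fit()            # predictors A-E
    reduced = sm.OLS(y, sm.add_constant(X[:, :3])).fit()  # predictors A-C (nested within A-E)

    # Does adding D and E significantly increase the variance accounted for?
    f_stat, p_value, df_diff = full.compare_f_test(reduced)
    print(f"R-square change = {full.rsquared - reduced.rsquared:.3f}")
    print(f"F({int(df_diff)}, {int(full.df_resid)}) = {f_stat:.2f}, p = {p_value:.4g}")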

We'll use the following article to delve more deeply into comparative model testing:

Bryant, A. L., Schulenberg, J., Bachman, J. G., O'Malley, P. M., & Johnston, L. D. (2000). Understanding the links among school misbehavior, academic achievement, and cigarette use: A national panel study of adolescents. Prevention Science, 1, 71-87.

Negative Variances (Heywood Cases)

A problem specific to SEM (and to factor-analytic models more generally) is that of negative residual variances. Variances, being squared quantities (a variance is the square of a standard deviation), cannot be negative. Negative estimated variances are known as "Heywood Cases." This webpage describes what a Heywood Case is and suggests a simple remedy. Additional discussion of Heywood Cases is available here.
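
Spotting a Heywood case in output simply means scanning the estimated variances for negative values. The sketch below is only an illustration; the table layout and column names are assumptions, not any particular program's output format.

    import pandas as pd

    # Hypothetical parameter-estimate table of the sort SEM software reports;
    # the labels and values here are made up for illustration
    estimates = pd.DataFrame({
        "parameter": ["e1 variance", "e2 variance", "e3 variance", "factor variance"],
        "estimate":  [0.42, -0.07, 0.55, 1.10],
    })

    # A Heywood case: an estimated variance that comes out negative
    heywood = estimates[estimates["parameter"].str.contains("variance")
                        & (estimates["estimate"] < 0)]
    print(heywood)  # flags the e2 variance estimate of -0.07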