[R] Specifying Path Model in SEM for CFA

Wed Aug 16 14:47:28 CEST 2006

Dear Rick,

There are a couple of problems here:

(1) You've fixed the error variance parameters for each of the observed
variables to 1 rather than defining each as a free parameter to estimate.
For example, use

X1 <-> X1, theta1, NA

rather than

X1 <-> X1, NA, 1

The general principle is that if you give a parameter a name, it's a free
parameter to be estimated; if you give the name as NA, then the parameter is
given a fixed value (here, 1). (There is some more information on this and
on error-variance parameters in ?sem.)
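
Applied to all six observed variables, the error-variance lines might look
like the sketch below. The names theta2 through theta6 simply extend the
theta1 example above and are arbitrary labels; giving NA in the third
position leaves the choice of start value to the program (as I read ?sem).

```
X1 <-> X1, theta1, NA
X2 <-> X2, theta2, NA
X3 <-> X3, theta3, NA
X4 <-> X4, theta4, NA
X5 <-> X5, theta5, NA
X6 <-> X6, theta6, NA
```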

(2) I believe that the model you're trying to specify -- in which all
variables but X6 load on F1, and all variables but X5 load on F2 -- is
underidentified.
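
A quick parameter count is a useful first check here, though it is only a
necessary condition and cannot by itself settle whether the model is
identified. The sketch below counts free parameters for the model as posted
and for the same model with the six error variances freed; the variable
names are made up for illustration.

```r
# Counting rule: free parameters may not exceed the number of distinct
# variances and covariances among the observed variables.
p <- 6                          # observed variables X1..X6
n_moments <- p * (p + 1) / 2    # distinct variances and covariances

# Model as posted: 8 free loadings (l11..l42) plus g11, g22, and g12.
# The six error variances are fixed at 1, so they are not counted.
free_posted <- 8 + 3
df_posted <- n_moments - free_posted

# Same model with the six error variances freed (theta1..theta6).
free_revised <- free_posted + 6
df_revised <- n_moments - free_revised

cat("df as posted:", df_posted, "\n")
cat("df with free error variances:", df_revised, "\n")
```

The first figure agrees with the Df = 10 reported in the quoted summary.
Positive degrees of freedom do not rule out underidentification, which
depends on the pattern of loadings as well as on the count.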

In addition, you've set the metric of the factors by fixing one loading to
0.20 and another to 0.25. That should work but strikes me as unusual, and
makes me wonder whether this was what you really intended. It would be more
common in a CFA to fix the variance of each factor to 1, and let the factor
loadings be free parameters. Then the factor covariance would be their
correlation. 
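
A sketch of that more conventional parameterization, showing only the lines
that would change relative to the posted model. The loading names l51 and
l62 are made up for illustration. This only sets the metric of the factors;
it does not address the identification concern in (2).

```
F1  -> X5, l51, NA
F2  -> X6, l62, NA
F1 <-> F1, NA,  1
F2 <-> F2, NA,  1
F1 <-> F2, g12, NA
```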

You should not have to specify start values for free parameters (such as
g11, g22, and g12 in your model), though it is not wrong to do so. I would
not, however, specify start values that imply a singular covariance matrix
among the factors, as you've done by starting both variances and the
covariance at 1; I'm surprised that the program was able to get past the
start values and produce a solution.
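
To see why those start values are a problem, one can build the factor
covariance matrix they imply (g11 = g22 = g12 = 1 in the posted model) and
check it directly. This is only a small illustration in base R; the object
names are made up.

```r
# Factor covariance matrix implied by the start values in the posted model:
# var(F1) = g11 = 1, var(F2) = g22 = 1, cov(F1, F2) = g12 = 1.
Phi_start <- matrix(c(1, 1,
                      1, 1),
                    nrow = 2, byrow = TRUE,
                    dimnames = list(c("F1", "F2"), c("F1", "F2")))

# A zero determinant means the matrix is singular.
det_start <- det(Phi_start)

# The implied correlation between the factors is exactly 1.
cor_start <- cov2cor(Phi_start)["F1", "F2"]

print(det_start)
print(cor_start)
```

Start values that put the factors at a correlation of 1 sit on the boundary
of the admissible parameter space, which is why leaving the start values as
NA, or choosing a covariance smaller than the variances, is the safer course.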

BTW, the Thurstone example in ?sem is for a confirmatory factor analysis
(albeit a slightly more complicated one with a second-order factor). There's
also an example of a one-factor CFA in the paper at
<http://socserv.socsci.mcmaster.ca/jfox/Misc/sem/SEM-paper.pdf>, though this
is for ordinal observed variables.

I hope this helps,
 John

--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
-------------------------------- 

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch 
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Rick Bilonick
> Sent: Tuesday, August 15, 2006 11:50 PM
> To: R Help
> Subject: [R] Specifying Path Model in SEM for CFA
> 
> I'm using specify.model for the sem package. I can't figure 
> out how to represent the residual errors for the observed 
> variables for a CFA model. (Once I get this working I need to 
> add some further constraints.)
> 
> Here is what I've tried:
> 
> model.sa <- specify.model()
>   F1  -> X1, l11, NA
>   F1  -> X2, l21, NA
>   F1  -> X3, l31, NA
>   F1  -> X4, l41, NA
>   F1  -> X5, NA,  0.20
>   F2  -> X1, l12, NA
>   F2  -> X2, l22, NA
>   F2  -> X3, l32, NA
>   F2  -> X4, l42, NA
>   F2  -> X6, NA,  0.25
>   F1 <-> F2, g12, 1
>   F1 <-> F1, g11, 1
>   F2 <-> F2, g22, 1
>   X1 <-> X1, NA,  1
>   X2 <-> X2, NA,  1
>   X3 <-> X3, NA,  1
>   X4 <-> X4, NA,  1
>   X5 <-> X5, NA,  1
>   X6 <-> X6, NA,  1
> 
> This at least converges:
> 
> > summary(fit.sem)
> 
>  Model Chisquare =  2147   Df =  10 Pr(>Chisq) = 0
>  Chisquare (null model) =  2934   Df =  15
>  Goodness-of-fit index =  0.4822
>  Adjusted goodness-of-fit index =  -0.087387
>  RMSEA index =  0.66107   90 % CI: (NA, NA)
>  Bentler-Bonnett NFI =  0.26823
>  Tucker-Lewis NNFI =  -0.098156
>  Bentler CFI =  0.26790
>  BIC =  2085.1
> 
>  Normalized Residuals
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>  -5.990  -0.618   0.192   0.165   1.700   3.950
> 
>  Parameter Estimates
>     Estimate  Std Error z value  Pr(>|z|)
> l11 -0.245981 0.21863   -1.12510 0.26054748 X1 <--- F1
> l21 -0.308249 0.22573   -1.36555 0.17207875 X2 <--- F1
> l31  0.202590 0.07910    2.56118 0.01043175 X3 <--- F1
> l41 -0.235156 0.21980   -1.06985 0.28468885 X4 <--- F1
> l12  0.839985 0.21962    3.82476 0.00013090 X1 <--- F2
> l22  0.828460 0.22548    3.67418 0.00023862 X2 <--- F2
> l32  0.066722 0.08369    0.79725 0.42530606 X3 <--- F2
> l42  0.832037 0.21840    3.80963 0.00013917 X4 <--- F2
> g12  0.936719 0.64331    1.45609 0.14536647 F2 <--> F1
> g11  2.567669 1.25608    2.04418 0.04093528 F1 <--> F1
> g22  1.208497 0.55040    2.19567 0.02811527 F2 <--> F2
> 
>  Iterations =  59
> 
> And it produces the following path diagram:
> 
> > path.diagram(fit.sem)
> digraph "fit.sem" {
>   rankdir=LR;
>   size="8,8";
>   node [fontname="Helvetica" fontsize=14 shape=box];
>   edge [fontname="Helvetica" fontsize=10];
>   center=1;
>   "F2" [shape=ellipse]
>   "F1" [shape=ellipse]
>   "F1" -> "X1" [label="l11"];
>   "F1" -> "X2" [label="l21"];
>   "F1" -> "X3" [label="l31"];
>   "F1" -> "X4" [label="l41"];
>   "F1" -> "X5" [label=""];
>   "F2" -> "X1" [label="l12"];
>   "F2" -> "X2" [label="l22"];
>   "F2" -> "X3" [label="l32"];
>   "F2" -> "X4" [label="l42"];
>   "F2" -> "X6" [label=""];
> }
> 
> But I don't see the residual error terms that go into each of 
> the observed variables X1 - X6. I've tried:
> 
> model.sa <- specify.model()
>   E1  -> X1, e1,  1
>   E2  -> X2, e2,  1
>   E3  -> X3, e3,  1
>   E4  -> X4, e4,  1
>   E5  -> X5, e5,  1
>   E6  -> X6, e6,  1
>   E1 <-> E1, s1,  NA
>   E2 <-> E2, s2,  NA
>   E3 <-> E3, s3,  NA
>   E4 <-> E4, s4,  NA
>   E5 <-> E5, s5,  NA
>   E6 <-> E6, s6,  NA
>   F1  -> X1, l11, NA
>   F1  -> X2, l21, NA
>   F1  -> X3, l31, NA
>   F1  -> X4, l41, NA
>   F1  -> X5, NA,  1
>   F2  -> X1, l12, NA
>   F2  -> X2, l22, NA
>   F2  -> X3, l32, NA
>   F2  -> X4, l42, NA
>   F2  -> X6, NA,  1
>   F1 <-> F2, NA,  1
>   F1 <-> F1, NA,  1
>   F2 <-> F2, g22, NA
>   X1 <-> X1, NA,  1
>   X2 <-> X2, NA,  1
>   X3 <-> X3, NA,  1
>   X4 <-> X4, NA,  1
>   X5 <-> X5, NA,  1
>   X6 <-> X6, NA,  1
> 
> I'm trying to use E1 - E6 as the residual error terms. But I 
> get warning messages about no variances for X1-X6 and it 
> doesn't converge. Also, the associated path diagram:
> 
> digraph "fit.sem" {
>   rankdir=LR;
>   size="8,8";
>   node [fontname="Helvetica" fontsize=14 shape=box];
>   edge [fontname="Helvetica" fontsize=10];
>   center=1;
>   "E1" [shape=ellipse]
>   "E2" [shape=ellipse]
>   "E3" [shape=ellipse]
>   "E4" [shape=ellipse]
>   "E5" [shape=ellipse]
>   "E6" [shape=ellipse]
>   "F2" [shape=ellipse]
>   "F1" [shape=ellipse]
>   "E1" -> "X1" [label=""];
>   "E2" -> "X2" [label=""];
>   "E3" -> "X3" [label=""];
>   "E4" -> "X4" [label=""];
>   "E5" -> "X5" [label=""];
>   "E6" -> "X6" [label=""];
>   "F1" -> "X1" [label="l11"];
>   "F1" -> "X2" [label="l21"];
>   "F1" -> "X3" [label="l31"];
>   "F1" -> "X4" [label="l41"];
>   "F1" -> "X5" [label=""];
>   "F2" -> "X1" [label="l12"];
>   "F2" -> "X2" [label="l22"];
>   "F2" -> "X3" [label="l32"];
>   "F2" -> "X4" [label="l42"];
>   "F2" -> "X6" [label=""];
> }
> 
> Has ellipses around the E1-E6 which I believe indicates they 
> are latent factors and not residual errors.
> 
> If anyone could point me in the right direction I would appreciate it.
> 
> Rick B.