[R] CFA with lavaan or with SEM

Wed Jan 23 17:10:40 CET 2013

Dear David,

It certainly helps to have a "reproducible example."

You've left out the error variances ("uniquenesses") for the observed
variables. You're also making the specification *much* harder than it needs
to be:

---------- snip -----------
> cfa.mod.1 <- cfa()
1: F: a, b, c, d, e, g
2: 
Read 1 item
NOTE: adding 6 variances to the model

> cfa.mod.1
   Path    Parameter StartValue
1  F -> a  <fixed>   1         
2  F -> b  lam[b:F]            
3  F -> c  lam[c:F]            
4  F -> d  lam[d:F]            
5  F -> e  lam[e:F]            
6  F -> g  lam[g:F]            
7  F <-> F V[F]                
8  a <-> a V[a]                
9  b <-> b V[b]                
10 c <-> c V[c]                
11 d <-> d V[d]                
12 e <-> e V[e]                
13 g <-> g V[g]               

> cfa.sem.1 <- sem(cfa.mod.1, S=my.cor, N=200)
> summary(cfa.sem.1)

 Model Chisquare =  543.6442   Df =  9 Pr(>Chisq) = 2.565155e-111
 AIC =  567.6442
 BIC =  495.9594

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
-1.536000 -0.135500  0.002829  0.294500  0.353400  5.337000 

 R-square for Endogenous Variables
     a      b      c      d      e      g 
0.1841 0.6969 0.8172 1.0084 0.8269 0.6007 

 Parameter Estimates
         Estimate     Std Error   z value   Pr(>|z|)             
lam[b:F]  1.945376727 0.302785547  6.424933 1.319280e-10 b <--- F
lam[c:F]  2.106647980 0.320689035  6.569130 5.061006e-11 c <--- F
lam[d:F]  2.340103148 0.347900207  6.726363 1.739560e-11 d <--- F
lam[e:F]  2.119171567 0.322095480  6.579327 4.725816e-11 e <--- F
lam[g:F]  1.806192591 0.287680436  6.278469 3.419240e-10 g <--- F
V[F]      0.184137740 0.057758730  3.188050 1.432356e-03 F <--> F
V[a]      0.815862342 0.081641551  9.993224 1.631854e-23 a <--> a
V[b]      0.303132223 0.030545714  9.923887 3.277381e-23 b <--> b
V[c]      0.182802929 0.019248279  9.497105 2.158058e-21 c <--> c
V[d]     -0.008353614 0.008298643 -1.006624 3.141154e-01 d <--> d
V[e]      0.173057855 0.018375461  9.417878 4.602950e-21 e <--> e
V[g]      0.399281457 0.039935977  9.998039 1.554445e-23 g <--> g

 Iterations =  59 

---------- snip -----------

Note that the default in cfa() is to use a reference indicator, and that the
solution is improper -- there's a negative estimated error variance, V[d]. 

An alternative specification sets the variance of the factor to 1, but then
cfa() fails to converge:

---------- snip -----------

> cfa.mod.2 <- cfa(reference.indicators=FALSE)
1: F: a, b, c, d, e, g
2: 
Read 1 item
NOTE: adding 6 variances to the model

> cfa.mod.2
   Path    Parameter StartValue
1  F -> a  lam[a:F]            
2  F -> b  lam[b:F]            
3  F -> c  lam[c:F]            
4  F -> d  lam[d:F]            
5  F -> e  lam[e:F]            
6  F -> g  lam[g:F]            
7  F <-> F <fixed>   1         
8  a <-> a V[a]                
9  b <-> b V[b]                
10 c <-> c V[c]                
11 d <-> d V[d]                
12 e <-> e V[e]                
13 g <-> g V[g]                

> cfa.sem.2 <- sem(cfa.mod.2, S=my.cor, N=200, debug=TRUE)

. . .

Start values:
  lam[a:F]   lam[b:F]   lam[c:F]   lam[d:F]   lam[e:F]   lam[g:F]       V[a]
V[b]       V[c]       V[d]       V[e]       V[g] 
0.65781335 0.87500031 0.89597921 0.95169707 0.87357655 0.86645865 0.56728160
0.23437445 0.19722125 0.09427268 0.23686401 0.24924941 

iteration = 0
Step:
 [1] 0 0 0 0 0 0 0 0 0 0 0 0
Parameter:
 [1] 0.65781335 0.87500031 0.89597921 0.95169707 0.87357655 0.86645865
0.56728160 0.23437445 0.19722125 0.09427268 0.23686401 0.24924941
Function Value
[1] 3.346898
Gradient:
 [1]  0.4583916  0.3957443 -0.2067868 -0.4369468 -0.2629929  0.2431501
-0.5501220 -1.6700002  0.6543088  3.0031327  0.7820309 -1.0122023

. . .

iteration = 21
Parameter:
 [1]  0.44280000  0.68987016  0.99055402  1.15651371  0.99812990  0.75293242
0.82441291  1.01174284  0.01185904 -1.30253783 -0.01183159
[12]  0.71942353
Function Value
[1] -316143
Gradient:
 [1]      83431722     105921661   12975044375    -137927630  -13105242109
162575760     -22404848     -36111801 -541872735153
[10]     -61232522 -552802111412     -85072888

Successive iterates within tolerance.
Current iterate is probably solution.

Warning message:
In eval(expr, envir, enclos) :
  Could not compute QR decomposition of Hessian.
Optimization probably did not converge.

---------- snip -----------

The problem seems ill-conditioned, and in any event the standard errors that
you get using tetrachoric correlations won't be right (I expect you know
that).

I hope this helps,
 John

> -----Original Message-----
> From: David Purves [mailto:David.Purves at glasgow.ac.uk]
> Sent: Wednesday, January 23, 2013 10:23 AM
> To: John Fox
> Cc: r-help at R-project.org
> Subject: RE: [R] CFA with lavaan or with SEM
> 
> Hi John
> 
> Thanks for your quick reply.
> 
> The full warning I got is
> 
> ' Error in csem(model = model.description, start, opt.flag = 1, typsize
> = typsize,  :
>   The matrix is non-invertable.'
> 
> The eigenvalues of the tetrachoric correlations are non negative. So it
> is must be how I am defining my model.
> 
> I have also tried it without having lavaan in the session.
> 
> A wee example of my error (whether it is sensible);
> 
> library(sem)
> 
> my.cor<-matrix(c( 1.0000000  ,  0.7600616  ,  0.3653309 ,   0.4377949 ,
> 0.2917927 ,   0.5133697,
>     0.7600616 ,   1.0000000,   0.6335519 ,   0.8288809 , 0.6223942  ,
> 0.6355725,
>      0.3653309 ,  0.6335519  ,  1.0000000 ,   0.9098309 , 0.9098309  ,
> 0.7693395,
>      0.4377949 , 0.8288809  ,  0.9098309  ,  1.0000000  ,0.9136967   ,
> 0.7829854,
>       0.2917927  ,0.6223942  ,  0.9098309  ,  0.9136967  ,1.0000000   ,
> 0.7354562,
>      0.5133697  ,0.6355725  ,  0.7693395  ,  0.7829854 , 0.7354562   ,
> 1.0000000),
>         nrow=6,byrow=T)
> 
> colnames(my.cor)<-rownames(my.cor)<-c("a","b","c","d","e","g")
> 
> eigen(my.cor)
> solve(my.cor)
> 
> #i tried defining the model in two ways
> 
>         model.1<-matrix(c(
>         #       arrow           #parameter              #start
>                 "f -> a",       "g1",                   NA,
>                 "f -> b",       "g2",                   NA,
>                 "f -> c",       "g3",                   NA,
>                 "f -> d",       "g4",                   NA,
>                 "f -> e",       "g5",                   NA,
>                 "f -> g",       "g6",                   NA,
>                 "f <-> f",      NA,                     1),
>                 ncol=3,byrow=T)
> 
> out<-sem(model.1,S=my.cor,200)
> 
> model.1 <- specifyEquations()
>  f1 = gam11*a + gam12*b + gam13*c + gam14*d + gam15*e + gam16*g
>  f1 = 1* f1
> 
> out<-sem(model.1,S=my.cor,200)
> 
> But the same error.
> 
> I would be very grateful if you could indicate where the error in my
> code is please.
> 
> 
> thanks, david
> 
> 
> 
> 
> -----Original Message-----
> From: John Fox [mailto:jfox at mcmaster.ca]
> Sent: 23 January 2013 14:00
> To: David Purves
> Cc: r-help at R-project.org
> Subject: Re: [R] CFA with lavaan or with SEM
> 
> Dear David,
> 
> On Wed, 23 Jan 2013 11:19:09 +0000
>  David Purves <David.Purves at glasgow.ac.uk> wrote:
> > Hi
> >
> > Sorry for the rather long message.
> >
> 
> . . .
> 
> >
> > I have tried the analysis using John Fox's SEM package / command.
> >
> > I calculate the correlation matrix with smoothing
> >
> > my.cor<-hetcor(north.dat.sub,use="pairwise.complete.obs")$correlations
> >
> > This returns the warning indicating that the correlation matrix was
> adjusted to make it positive definite. However the following sem model
> does not run, with the error message that the matrix is non-invertible.
> >
> > mod1<-sem::sem(sem .model.1, S=my.cor, 300)
> >
> > Should the smoothing not allow it to be inverted?
> >
> 
> If the input correlation matrix is really positive definite, then it has
> an inverse. You could check directly, e.g., by looking at the
> eignevalues of the tetrachoric correlation matrix. There's very little
> here to go on, not even the error message produced by sem(). By the way,
> I assume that you didn't really call sem in the sem package as sem::sem
> in a session in which lavann was loaded. I'm not sure what would happen
> if you did that.
> 
> Best,
>  John
> 
> ------------------------------------------------
> John Fox
> Sen. William McMaster Prof. of Social Statistics Department of Sociology
> McMaster University Hamilton, Ontario, Canada
> http://socserv.mcmaster.ca/jfox/
> 
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 
> 
> 
> 
> The University of Glasgow, charity number SC004401