[R] factanal AIC?

Lucke, Joseph F Joseph.F.Lucke at uth.tmc.edu
Mon May 7 18:16:48 CEST 2007


The log likelihood is

const + n/2* [  log(det(Sigma^-1)) - trace(Sigma^-1*S) ]  where Sigma is the population covariance matrix 

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Spencer Graves
Sent: Friday, May 04, 2007 9:20 PM
To: R-devel mailing list
Cc: Jens Oehlschlägel; r-help at stat.math.ethz.ch
Subject: Re: [R] factanal AIC?

Dear R Developers: 

      What is the proper log(likelihood) for 'factanal'?  I believe it should be something like the following: 

      (-n/2)*(k*(log(2*pi)+1)+log(det(S)))

or without k*(log(2*pi)-1): 

      (-n/2)*log(det(S)),

where n = the number of (multivariate) observations. 

      The 'factanal' help file say the factanal component "criteria:  
The results of the optimization: the value of the negative log-likelihood and information on the iterations used." 

      However, I'm not able to get this.  If it's a log(likelihood), then replacing a data set m1 by rbind(m1, m1) should double the log(likelihood).  However it has no impact on it.  Consider the following modification of the first example in the 'factanal' help page: 

 >    v1 <- c(1,1,1,1,1,1,1,1,1,1,3,3,3,3,3,4,5,6)
 >      v2 <- c(1,2,1,1,1,1,2,1,2,1,3,4,3,3,3,4,6,5)
 >      v3 <- c(3,3,3,3,3,1,1,1,1,1,1,1,1,1,1,5,4,6)
 >      v4 <- c(3,3,4,3,3,1,1,2,1,1,1,1,2,1,1,5,6,4)
 >      v5 <- c(1,1,1,1,1,3,3,3,3,3,1,1,1,1,1,6,4,5)
 >      v6 <- c(1,1,1,2,1,3,3,3,4,3,1,1,1,2,1,6,5,4)
 >      m1 <- cbind(v1,v2,v3,v4,v5,v6)
 > tst <- factanal(m1, factors=3)
 > fit1 <- factanal(m1, factors=3)
 > fit2 <- factanal(rbind(m1, m1), factors=3)  > fit1$criteria["objective"] objective
0.4755156
 > fit2$criteria["objective"]
objective
0.4755156

      From the following example, it seems that the "k*(log(2*pi)-1) term is omitted: 

 > x2 <- c(-1,1)
 > X2.4 <- as.matrix(expand.grid(x2, x2, x2, x2))  > factanal(X2.4, 1)$criteria
      objective counts.function counts.gradient
              0               7               7

      However, I can't get the 'fit1$criteria' above, even ignoring the sample size.  Consider the following: 

      > S3 <- tcrossprod(fit1$loadings)+diag(fit1$uniquenesses)
 > log(det(S3))
[1] -6.725685

      Shouldn't this be something closer to the 0.4755 for fit1$criteria? 

      Thanks,
      Spencer Graves

JENS:  For your purposes, I suggest you try to compute (-n/2)*log(det(tcrossprod(fit$loadings)+diag(fit$uniquenesses)).  If you do this, you might get more sensible results with your examples. 

      Hope this helps. 
      Spencer Graves

Jens Oehlschlägel wrote:
> Dear list members,
>
> Could any expert on factor analysis be so kind to explain how to calculate AIC on the output of factanal. Do I calculate AIC wrong or is factanal$criteria["objective"] not a negative log-likelihood?
>
> Best regards
>
>
> Jens Oehlschlägel
>
>
>
> The AIC calculated using summary.factanal below don't appear correct to me:
>
>       n items factors total.df rest.df model.df           LL       AIC      AICc       BIC
> 1  1000    20       1      210     170       40 -5.192975386  90.38595  93.80618  286.6962
> 2  1000    20       2      210     151       59 -3.474172303 124.94834 132.48026  414.5059
> 3  1000    20       3      210     133       77 -1.745821627 157.49164 170.51984  535.3888
> 4  1000    20       4      210     116       94 -0.120505369 188.24101 207.97582  649.5700
> 5  1000    20       5      210     100      110 -0.099209921 220.19842 247.66749  760.0515
> 6  1000    20       6      210      85      125 -0.072272695 250.14455 286.18574  863.6140
> 7  1000    20       7      210      71      139 -0.054668588 278.10934 323.36515  960.2873
> 8  1000    20       8      210      58      152 -0.041708051 304.08342 358.99723 1050.0622
> 9  1000    20       9      210      46      164 -0.028686298 328.05737 392.87174 1132.9292
> 10 1000    20      10      210      35      175 -0.015742783 350.03149 424.78877 1208.8887
> 11 1000    20      11      210      25      185 -0.007007901 370.01402 454.55947 1277.9487
> 12 1000    20      12      210      16      194 -0.005090725 388.01018 481.99776 1340.1147
>
>
> summary.factanal <- function(object, ...){
>   if (inherits(object, "try-error")){
>     c(n=NA, items=NA, factors=NA, total.df=NA, rest.df=NA, model.df=NA, LL=NA, AIC=NA, AICc=NA, BIC=NA)
>   }else{
>     n <- object$n.obs
>     p <- length(object$uniquenesses)
>     m <- object$factors
>     model.df <- (p*m) + (m*(m+1))/2 + p - m^2
>     total.df <- p*(p+1)/2
>     rest.df <- total.df - model.df # = object$dof
>     LL <- -as.vector(object$criteria["objective"])
>     k <- model.df
>     aic <- 2*k - 2*LL
>     aicc <- aic + (2*k*(k+1))/(n-k-1)
>     bic  <- k*log(n) - 2*LL
>     c(n=n, items=p, factors=m, total.df=total.df, rest.df=rest.df, model.df=model.df, LL=LL, AIC=aic, AICc=aicc, BIC=bic)
>   }
> }
>
> multifactanal <- function(factors=1:3, ...){
>   names(factors) <- factors
>   ret <- lapply(factors, function(factors){
>     try(factanal(factors=factors, ...))
>   })
>   class(ret) <- "multifactanal"
>   ret
> }
>
> summary.multifactanal <- function(object,...){
>   do.call("rbind", lapply(object, summary.factanal)) }
>
> print.multifactanal <- function(x,...){
>   ret <- summary.multifactanal(x)
>   print(ret, ...)
>   invisible(ret)
> }
>
> # simulate a true 4-factor model
> n <- 1000
> ktrue <- 4
> kfac <- 5
> true <- matrix(rnorm(n*ktrue), ncol=ktrue) x <- matrix(rep(true, 
> kfac)+rnorm(n*ktrue*kfac), ncol=ktrue*kfac)
> dimnames(x) <- list(NULL, paste(rep(letters[1:ktrue], kfac), 
> rep(1:kfac, rep(ktrue, kfac)), sep="")) covmat <- cov.wt(x) # run 
> factanal for several numbers of factors mf <- 
> multifactanal(factors=1:12, covmat=covmat) mf
>
>
> version
>                _                           
> platform       i386-pc-mingw32             
> arch           i386                        
> os             mingw32                     
> system         i386, mingw32               
> status                                     
> major          2                           
> minor          5.0                         
> year           2007                        
> month          04                          
> day            23                          
> svn rev        41293                       
> language       R                           
> version.string R version 2.5.0 (2007-04-23)
>

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list