[Rd] [R] factanal AIC?
Spencer Graves
spencer.graves at pdf.com
Sat May 5 04:20:04 CEST 2007
Dear R Developers:
What is the proper log(likelihood) for 'factanal'? I believe it
should be something like the following:
(-n/2)*(k*(log(2*pi)+1)+log(det(S)))
or without k*(log(2*pi)-1):
(-n/2)*log(det(S)),
where n = the number of (multivariate) observations.
The 'factanal' help file say the factanal component "criteria:
The results of the optimization: the value of the negative
log-likelihood and information on the iterations used."
However, I'm not able to get this. If it's a log(likelihood),
then replacing a data set m1 by rbind(m1, m1) should double the
log(likelihood). However it has no impact on it. Consider the
following modification of the first example in the 'factanal' help page:
> v1 <- c(1,1,1,1,1,1,1,1,1,1,3,3,3,3,3,4,5,6)
> v2 <- c(1,2,1,1,1,1,2,1,2,1,3,4,3,3,3,4,6,5)
> v3 <- c(3,3,3,3,3,1,1,1,1,1,1,1,1,1,1,5,4,6)
> v4 <- c(3,3,4,3,3,1,1,2,1,1,1,1,2,1,1,5,6,4)
> v5 <- c(1,1,1,1,1,3,3,3,3,3,1,1,1,1,1,6,4,5)
> v6 <- c(1,1,1,2,1,3,3,3,4,3,1,1,1,2,1,6,5,4)
> m1 <- cbind(v1,v2,v3,v4,v5,v6)
> tst <- factanal(m1, factors=3)
> fit1 <- factanal(m1, factors=3)
> fit2 <- factanal(rbind(m1, m1), factors=3)
> fit1$criteria["objective"]
objective
0.4755156
> fit2$criteria["objective"]
objective
0.4755156
From the following example, it seems that the "k*(log(2*pi)-1)
term is omitted:
> x2 <- c(-1,1)
> X2.4 <- as.matrix(expand.grid(x2, x2, x2, x2))
> factanal(X2.4, 1)$criteria
objective counts.function counts.gradient
0 7 7
However, I can't get the 'fit1$criteria' above, even ignoring the
sample size. Consider the following:
> S3 <- tcrossprod(fit1$loadings)+diag(fit1$uniquenesses)
> log(det(S3))
[1] -6.725685
Shouldn't this be something closer to the 0.4755 for fit1$criteria?
Thanks,
Spencer Graves
JENS: For your purposes, I suggest you try to compute
(-n/2)*log(det(tcrossprod(fit$loadings)+diag(fit$uniquenesses)). If you
do this, you might get more sensible results with your examples.
Hope this helps.
Spencer Graves
Jens Oehlschlägel wrote:
> Dear list members,
>
> Could any expert on factor analysis be so kind to explain how to calculate AIC on the output of factanal. Do I calculate AIC wrong or is factanal$criteria["objective"] not a negative log-likelihood?
>
> Best regards
>
>
> Jens Oehlschlägel
>
>
>
> The AIC calculated using summary.factanal below don't appear correct to me:
>
> n items factors total.df rest.df model.df LL AIC AICc BIC
> 1 1000 20 1 210 170 40 -5.192975386 90.38595 93.80618 286.6962
> 2 1000 20 2 210 151 59 -3.474172303 124.94834 132.48026 414.5059
> 3 1000 20 3 210 133 77 -1.745821627 157.49164 170.51984 535.3888
> 4 1000 20 4 210 116 94 -0.120505369 188.24101 207.97582 649.5700
> 5 1000 20 5 210 100 110 -0.099209921 220.19842 247.66749 760.0515
> 6 1000 20 6 210 85 125 -0.072272695 250.14455 286.18574 863.6140
> 7 1000 20 7 210 71 139 -0.054668588 278.10934 323.36515 960.2873
> 8 1000 20 8 210 58 152 -0.041708051 304.08342 358.99723 1050.0622
> 9 1000 20 9 210 46 164 -0.028686298 328.05737 392.87174 1132.9292
> 10 1000 20 10 210 35 175 -0.015742783 350.03149 424.78877 1208.8887
> 11 1000 20 11 210 25 185 -0.007007901 370.01402 454.55947 1277.9487
> 12 1000 20 12 210 16 194 -0.005090725 388.01018 481.99776 1340.1147
>
>
> summary.factanal <- function(object, ...){
> if (inherits(object, "try-error")){
> c(n=NA, items=NA, factors=NA, total.df=NA, rest.df=NA, model.df=NA, LL=NA, AIC=NA, AICc=NA, BIC=NA)
> }else{
> n <- object$n.obs
> p <- length(object$uniquenesses)
> m <- object$factors
> model.df <- (p*m) + (m*(m+1))/2 + p - m^2
> total.df <- p*(p+1)/2
> rest.df <- total.df - model.df # = object$dof
> LL <- -as.vector(object$criteria["objective"])
> k <- model.df
> aic <- 2*k - 2*LL
> aicc <- aic + (2*k*(k+1))/(n-k-1)
> bic <- k*log(n) - 2*LL
> c(n=n, items=p, factors=m, total.df=total.df, rest.df=rest.df, model.df=model.df, LL=LL, AIC=aic, AICc=aicc, BIC=bic)
> }
> }
>
> multifactanal <- function(factors=1:3, ...){
> names(factors) <- factors
> ret <- lapply(factors, function(factors){
> try(factanal(factors=factors, ...))
> })
> class(ret) <- "multifactanal"
> ret
> }
>
> summary.multifactanal <- function(object,...){
> do.call("rbind", lapply(object, summary.factanal))
> }
>
> print.multifactanal <- function(x,...){
> ret <- summary.multifactanal(x)
> print(ret, ...)
> invisible(ret)
> }
>
> # simulate a true 4-factor model
> n <- 1000
> ktrue <- 4
> kfac <- 5
> true <- matrix(rnorm(n*ktrue), ncol=ktrue)
> x <- matrix(rep(true, kfac)+rnorm(n*ktrue*kfac), ncol=ktrue*kfac)
> dimnames(x) <- list(NULL, paste(rep(letters[1:ktrue], kfac), rep(1:kfac, rep(ktrue, kfac)), sep=""))
> covmat <- cov.wt(x)
> # run factanal for several numbers of factors
> mf <- multifactanal(factors=1:12, covmat=covmat)
> mf
>
>
> version
> _
> platform i386-pc-mingw32
> arch i386
> os mingw32
> system i386, mingw32
> status
> major 2
> minor 5.0
> year 2007
> month 04
> day 23
> svn rev 41293
> language R
> version.string R version 2.5.0 (2007-04-23)
>
More information about the R-devel
mailing list