# [R] covariance = diagonal + F'F

Spencer Graves spencer.graves at pdf.com
Sun Apr 20 21:59:38 CEST 2003

Have you considered "factanal" in library(mva)?  Unless you want to
force "diagonal" to be a constant times the identity matrix, this is
essentially the model fit by factanal.

I have not followed the factor analysis literature for some years, so
I don't know what modern factor analysis gurus recommend to estimate "p"
= the number of factors, but if I had to have an answer this afternoon,
I would do the following:  First, I would fit models for p = 1, 2, and
3.  I would also try p = 0 and hope that would not bomb the software.
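A minimal R sketch of this first step (not from the original post), using the built-in `ability.cov` example in place of the poster's data. Note that `factanal()` refuses `factors = 0`, which is why p starts at 1 here and why p = 0 may well "bomb":

```r
## Fit factanal() for a few candidate numbers of factors p.
## Note: factanal() now ships in the standard "stats" package
## (library(mva) was absorbed into it as of R 1.9).
ps   <- 1:2
fits <- lapply(ps, function(p) factanal(factors = p, covmat = ability.cov))

## The pieces used below:
negLL <- sapply(fits, function(f) f$criteria["objective"])  # ~ -log(likelihood)
dof   <- sapply(fits, function(f) f$dof)                    # model deg. of freedom
```

With a raw n*m data matrix `X` you would call `factanal(X, factors = p)` directly instead of passing `covmat`.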

The documentation for "factanal" says the fitted object has a component
"criteria", whose "objective" element is a linear function of the
negative log(likelihood).  I'd extract
those numbers as one column of a matrix.  In the second column, I'd put
the attribute "dof" = degrees of freedom in the model.  I'd then compute
the Akaike Information Criterion (AIC) and put that in the third column.

More precisely, I would compute the AIC with a finite sample correction,

AIC.c = (-2)*(log(likelihood) - K*(1 + (K+1)/(n*m - K - 1))),

where K = (dof+m);  this is a slight modification of AIC.c described by
Burnham and Anderson (2002) Model Selection and Multi-Model Inference,
2nd ed. (Springer).  The model with the smallest value for AIC.c is the
best.  Following Burnham and Anderson, I would then compute Del.AIC =
AIC.c - min(AIC.c) and put that in another column.  The next column
would then hold w.i = exp(-Del.AIC/2), and the final column would hold
posterior = w.i/sum(w.i).  This computation is an application of Bayes'
theorem with a uniform prior and with a correction for bias in
estimating the likelihood.
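The bookkeeping above can be sketched as a small helper (not from the original post; the inputs below are placeholder numbers standing in for the factanal results, with K = dof + m and n*m as the sample size, as described):

```r
## Build the model-selection table described in the text, one row per
## candidate number of factors p.
aic.table <- function(negLogLik, dof, n, m) {
  K     <- dof + m
  AIC.c <- 2 * negLogLik + 2 * K * (1 + (K + 1) / (n * m - K - 1))
  Del   <- AIC.c - min(AIC.c)
  w     <- exp(-Del / 2)
  cbind(negLogLik, dof, AIC.c, Del.AIC = Del, w.i = w,
        posterior = w / sum(w))
}

## Placeholder values for p = 1, 2, 3:
tab <- aic.table(negLogLik = c(120, 96, 95), dof = c(9, 4, 1),
                 n = 100, m = 6)
```

The row with posterior near 1 (equivalently, the smallest AIC.c) is the favored p.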

If this posterior does not sharply favor a value of p in the middle
of the range tested, I would test more values for p.  If I could not
easily get a value for p = 0 and the posterior strongly favored 1
without it, I would try to develop another way to get a comparable number
for p = 0.

Hope this helps,
spencer graves

Thomas W Blackwell wrote:
>
> Use  svd()  or  eigen()  to get the eigenvectors and eigenvalues
> of a covariance matrix.  svd() gives them without calculating the
> matrix product first, so it is preferable numerically.  For your
> sum-of-matrices decomposition, I think you'll have to calculate
> the covariance matrix minus its diagonal first, and expect the
> result not to be positive definite.  Then calculate eigenvectors
> and eigenvalues of that and truncate the number retained.  "Best"
> is entirely subjective.
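
The quoted recipe can be sketched in R as follows (not part of the original message; the data matrix and p = 2 are placeholders):

```r
## Eigen route: subtract the diagonal, eigen-decompose what is left
## (it need not be positive definite), and keep the top p eigen-pairs
## to form a p*m factor matrix Fmat with t(Fmat) %*% Fmat ~ S0.
set.seed(1)
X  <- matrix(rnorm(100 * 6), 100, 6)   # placeholder n*m data
S  <- cov(X)
S0 <- S - diag(diag(S))                # covariance minus its diagonal

e    <- eigen(S0, symmetric = TRUE)
p    <- 2
lam  <- pmax(e$values[1:p], 0)         # clip any negative eigenvalues
Fmat <- diag(sqrt(lam)) %*% t(e$vectors[, 1:p])
```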
>
> I do not recall this sum decomposition from a previous thread.
>
> HTH  -  tom blackwell  -  u michigan medical school  -  ann arbor  -
>
> On Sun, 20 Apr 2003, Vadim Ogranovich wrote:
>
>
>>Dear R-Helpers,
>>
>>I have a n*m data matrix (n is the number of observations) and I want to
>>estimate its covariance matrix as a sum of a diagonal matrix and a low-rank
>>matrix F'F, where F is p*m matrix (sometimes called "factors"), p<<m, and F'
>>is F transpose.
>>
>>My questions are:
>>1. Given the number of factors p is there an R function that finds the best
>>F?
>>2. How to select the "best" p?
>>
>>Somehow I feel like this was already discussed on the list (but I couldn't
>>find it in the archives). If this is the case what was the subject of that