[R] prcomp(X,center=F) ??
Agustin Lobo
aloboaleu at gmail.com
Sun Mar 8 10:07:49 CET 2009
I do not understand, from a PCA point of view, the option center=F
of prcomp()
According to the help page, the calculation in prcomp() "is done by a
singular value decomposition of the (centered and possibly scaled) data
matrix, not by using eigen on the covariance matrix" (as is done by
princomp()). "This is generally the preferred method for numerical
accuracy."
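To make the quoted sentence concrete, here is a minimal sketch (not taken from the help page) comparing the two routes on centered data. The seed and the names Xc, s, e, p, sdev_svd and sdev_eigen are my own choices for illustration.

```r
# Sketch: SVD of the centered data matrix vs. eigen() on the covariance matrix.
set.seed(1)                                   # arbitrary seed, for reproducibility only
X  <- cbind(rnorm(100, 100, 50), rnorm(100, 100, 50))
Xc <- scale(X, center = TRUE, scale = FALSE)  # column-centered copy of X

s <- svd(Xc)                                  # route used by prcomp()
e <- eigen(cov(X))                            # route used by princomp()
p <- prcomp(X, center = TRUE, scale. = FALSE)

sdev_svd   <- s$d / sqrt(nrow(X) - 1)         # singular values -> standard deviations
sdev_eigen <- sqrt(e$values)                  # eigenvalues -> standard deviations

print(all.equal(p$sdev, sdev_svd))
print(all.equal(p$sdev, sdev_eigen))
# The directions agree up to the sign of each column
print(all.equal(abs(unname(p$rotation)), abs(s$v)))
print(all.equal(abs(unname(p$rotation)), abs(e$vectors)))
```

Both routes should agree to within rounding here. The help page's point about numerical accuracy concerns less well-conditioned data than this toy matrix.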
The problem is that while
prcomp(X,center=T,scale.=F) is equivalent to princomp(X,cor=F),
prcomp(X,center=F) has no equivalent in princomp().
Also, rotating the data with the eigenvectors from either
prcomp(X,center=T,scale.=F) or princomp(X,cor=F)
yields uncorrelated PCs (zero correlation up to rounding), as expected
for a PCA. But rotating with the eigenvectors from
prcomp(X,center=F) yields axes that are correlated.
Therefore, prcomp(X,center=F) is not really a PCA.
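The equivalence between centered prcomp() and princomp() can be checked with a short sketch like the one below. It assumes that princomp()'s cor = FALSE is the counterpart of an unscaled prcomp(). The seed and the names p, q, load_p and load_q are mine.

```r
# Sketch: centered prcomp() vs. princomp() on the same data.
set.seed(2)                                   # arbitrary seed, for reproducibility only
X <- cbind(rnorm(100, 100, 50), rnorm(100, 100, 50))
n <- nrow(X)

p <- prcomp(X, center = TRUE, scale. = FALSE)
q <- princomp(X, cor = FALSE)

load_p <- unname(p$rotation)
load_q <- unname(unclass(q$loadings))         # drop the "loadings" class and the names

# Same directions, up to the sign of each column
print(all.equal(abs(load_p), abs(load_q)))
# princomp() uses divisor n and prcomp() uses n - 1, so sdev differs by a constant factor
print(all.equal(unname(q$sdev), p$sdev * sqrt((n - 1) / n)))
```

So "equivalent" holds for the eigenvectors, while the reported standard deviations differ by the factor sqrt((n-1)/n).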
See the following example, in which the second column of
data matrix X is linearly correlated to the first column:
> X <- cbind(rnorm(100,100,50),rnorm(100,100,50))
> X[,2] <- X[,1]*1.5-50 +runif(100,-70,70)
> plot(X)
> cor(X[,1],X[,2])
[1] 0.903597
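> # prcomp() names its scaling argument 'scale.'; $sdev holds the standard
> # deviations (square roots of the eigenvalues), $rotation the eigenvectors
> eigvnocent <- prcomp(X,center=FALSE,scale.=FALSE)$sdev
> eigvcent <- prcomp(X,center=TRUE,scale.=FALSE)$sdev
> eigvecnocent <- prcomp(X,center=FALSE,scale.=FALSE)$rotation
> eigveccent <- prcomp(X,center=TRUE,scale.=FALSE)$rotation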
> PCnocent <- X%*%eigvecnocent
> PCcent <- X%*%eigveccent
> par(mfrow=c(2,2))
> plot(X)
> plot(PCnocent)
> plot(PCcent)
> cor(X[,1],X[,2])
[1] 0.903597
> cor(PCcent[,1],PCcent[,2])
[1] -8.778818e-16
> cor(PCnocent[,1],PCnocent[,2])
[1] -0.6908334
>
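A hedged sketch of what may be going on in the uncentered case: with center=FALSE the rotation appears to come from the SVD of the raw X, so the score columns are orthogonal as raw cross-products, while cor() subtracts the column means before comparing them. The seed and the names p0, s0, scores0, cp and cosine are my own.

```r
# Sketch: scores from prcomp(center = FALSE) are orthogonal, but not uncorrelated.
set.seed(3)                                   # arbitrary seed, for reproducibility only
X <- cbind(rnorm(100, 100, 50), rnorm(100, 100, 50))
X[, 2] <- X[, 1] * 1.5 - 50 + runif(100, -70, 70)

p0 <- prcomp(X, center = FALSE, scale. = FALSE)
s0 <- svd(X)                                  # SVD of the raw, uncentered matrix

# The uncentered rotation matches the right singular vectors of X (up to sign)
print(all.equal(abs(unname(p0$rotation)), abs(s0$v)))

scores0 <- X %*% p0$rotation
cp      <- crossprod(scores0)                 # t(scores0) %*% scores0, no centering
cosine  <- cp[1, 2] / sqrt(cp[1, 1] * cp[2, 2])
print(cosine)                                 # numerically zero: orthogonal columns

# cor() centers each column first, so this is generally not zero
print(cor(scores0[, 1], scores0[, 2]))
```

Under this reading, center=FALSE decomposes the second moments about the origin rather than the covariance about the mean.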
Also the help page of prcomp() states:
"Details
The calculation is done by a singular value decomposition of the
(centered and possibly scaled) data matrix..."
The parenthetical is somewhat ambiguous, but I read the sentence
as saying that the calculation should always be done on a
centered data matrix.
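One way to see what prcomp() actually does with the data is to inspect the center component of its return value, which is documented as the centering used, or FALSE. A small sketch, with my own seed and the names p1 and p0:

```r
# Sketch: check whether prcomp() centered the data.
set.seed(4)                                   # arbitrary seed, for reproducibility only
X <- cbind(rnorm(100, 100, 50), rnorm(100, 100, 50))

p1 <- prcomp(X, center = TRUE,  scale. = FALSE)
p0 <- prcomp(X, center = FALSE, scale. = FALSE)

print(p1$center)                              # the column means that were subtracted
print(p0$center)                              # FALSE: no centering was applied

print(all.equal(unname(p1$center), colMeans(X)))
print(identical(p0$center, FALSE))
```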
Finally, all the examples in the help page use centering (or scaling,
which implies centering).
So why is there an option center=F?
Agus
--
Dr. Agustin Lobo
Institut de Ciencies de la Terra "Jaume Almera" (CSIC)
LLuis Sole Sabaris s/n
08028 Barcelona
Spain
Tel. 34 934095410
Fax. 34 934110012
email: Agustin.Lobo at ija.csic.es
http://www.ija.csic.es/gt/obster