[R] prcomp: where do sdev values come from?
David L Carlson
dcarlson at tamu.edu
Thu Jun 21 23:36:38 CEST 2012
This should help you figure it out:
getAnywhere("prcomp.default")
----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-
> project.org] On Behalf Of Adams, Sky
> Sent: Wednesday, June 20, 2012 3:15 PM
> To: r-help at r-project.org
> Subject: [R] prcomp: where do sdev values come from?
>
> In the manual page for prcomp(), it says that sdev is "the standard
> deviations of the principal components (i.e., the square roots of the
> eigenvalues of the covariance/correlation matrix, though the
> calculation is actually done with the singular values of the data
> matrix)." However, this is not what I'm finding. The values appear
> to be the standard deviations of a reprojection of the data, as in the
> example below.
>
> >
> test=matrix(data=c(1,1,2,2,0,0,2,1,2,1,0,2,1,2,1,2,2,0),nrow=3,ncol=6)
> > test=standardize(test)
> > test
> [,1] [,2] [,3] [,4] [,5] [,6]
> [1,] -0.7071068 2.828427 0.8944272 0 -0.7071068 1.414214
> [2,] -0.7071068 -1.414214 -1.7888544 -2 1.4142136 1.414214
> [3,] 1.4142136 -1.414214 0.8944272 2 -0.7071068 -2.828427
>
> > res1
> Standard deviations:
> [1] 3.622043e+00 2.877639e+00 1.812987e-16
>
> Rotation:
> PC1 PC2 PC3
> [1,] 0.3330598 -0.07347383 0.1418171
> [2,] -0.2319540 0.79957999 0.3837549
> [3,] 0.2745904 0.41276093 -0.8541541
> [4,] 0.5186794 0.23838204 0.2096053
> [5,] -0.2170828 -0.32631616 -0.1153013
> [6,] -0.6661196 0.14694766 -0.2140375
>
> > res2=svd(test)
> >res2
> $d
> [1] 5.122342e+00 4.069596e+00 1.153778e-15
>
> $u
> [,1] [,2] [,3]
> [1,] -0.2800491 0.7669675 0.5773503
> [2,] -0.5241888 -0.6260134 0.5773503
> [3,] 0.8042379 -0.1409541 0.5773503
>
> $v
> [,1] [,2] [,3]
> [1,] 0.3330598 -0.07347383 0.7608840
> [2,] -0.2319540 0.79957999 0.3438537
> [3,] 0.2745904 0.41276093 -0.4028185
> [4,] 0.5186794 0.23838204 0.1543444
> [5,] -0.2170828 -0.32631616 0.3236952
> [6,] -0.6661196 0.14694766 0.1093469
>
> Here, the values in $d are not the squares of the standard deviations.
>
> > res3=test %*% res2$v
> > res3
> [,1] [,2] [,3]
> [1,] -1.434507 3.1212479 -7.494005e-16
> [2,] -2.685074 -2.5476217 3.996803e-15
> [3,] 4.119582 -0.5736263 -2.053913e-15
> > apply(res3,2,sd)
> [1] 3.622043e+00 2.877639e+00 3.184320e-15
>
> As shown above, taking the standard deviation of the columns of the
> reprojection of the eigenvectors on the original "data" gives the same
> values as the standard deviations given by prcomp(). Can anyone
> reconcile this example with the manual page for prcomp() or explain
> how the standard deviations are calculated?
>
> the code for standardize():
>
> standardize<-function(data){
> numSamples = nrow(data);
>
> # calculate mean for each marker
> mn=apply(data,2,mean);
> numMarkers = ncol(data);
> #standardize every entry
> for(j in 1:numMarkers){
> for(k in 1:numSamples){
> temp = sqrt(mn[j]/2*(1-mn[j]/2));
> data[k,j] = (data[k,j] - mn[j])/temp;
> }
> }
> data;
> }
>
> Thank you,
> Sky
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list