[R] how to tell if its better to standardize your data matrix first when you do principal
Uwe Ligges
ligges at statistik.tu-dortmund.de
Sun Nov 22 16:22:02 CET 2009
masterinex wrote:
>
>
> Hi guys,
>
> I'm trying to do principal component analysis in R. There are two ways of
> doing it, I believe.
> One is doing principal component analysis right away; the other is
> standardizing the matrix first using s = scale(m) and then applying
> principal component analysis.
> How do I tell which result is better? What values in particular should I
> look at? I already managed to find the eigenvalues and eigenvectors and the
> proportion of variance for each eigenvector using both methods.
>
Generally, it is better to standardize. But in some cases, e.g. when all
variables are measured in the same units and their variances also reflect
their importance, it might make sense not to do so.
You should think about the analysis: you cannot know which result is
`better' unless you have an interpretation for it.
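
To make the comparison concrete, here is a minimal sketch of both variants
using prcomp(). The built-in USArrests data set and the object names are
assumptions chosen purely for illustration; they are not the poster's data.

```r
# Illustration only: USArrests is a built-in data set used as a stand-in.
m <- as.matrix(USArrests)

pca_raw <- prcomp(m, center = TRUE, scale. = FALSE)  # PCA on the covariance matrix
pca_std <- prcomp(m, center = TRUE, scale. = TRUE)   # PCA on the correlation matrix

# Column variances: a variable with a much larger variance dominates the raw PCA.
print(apply(m, 2, var))

# Proportion of variance explained by each component, for both variants.
prop_raw <- pca_raw$sdev^2 / sum(pca_raw$sdev^2)
prop_std <- pca_std$sdev^2 / sum(pca_std$sdev^2)
print(round(rbind(raw = prop_raw, standardized = prop_std), 3))

# Standardizing first with scale() is equivalent to using scale. = TRUE.
pca_scaled_first <- prcomp(scale(m))
print(all.equal(pca_std$sdev, pca_scaled_first$sdev))
```

In this data set the unstandardized first component takes the larger share
of variance because one column has a far larger variance than the others.
That ordering is a property of the data, not a guarantee, so the two sets of
proportions are not directly comparable as a measure of which analysis is
`better'.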
> I noticed that the proportion of variance for the first principal component
> without standardizing had a larger value. Is there a meaning to it? Isn't
> this always the case?
> Lastly, if I am supposed to predict a variable, i.e. weight, should I drop
> that variable from my data matrix when I do principal component
> analysis?
This sounds a bit like homework. If that is the case, please ask your
teacher rather than this list.
Anyway, it does not make sense to predict weight using a linear
combination (principal component) that contains weight, does it?
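
As a rough sketch of what that means in practice: drop the response before
running the PCA, then regress it on the component scores. The mtcars data
set, its wt column as the `weight' variable, and the choice of two components
are all assumptions made for illustration only.

```r
# Illustration only: mtcars and its wt column stand in for the poster's data.
dat <- mtcars
predictors <- dat[, setdiff(names(dat), "wt")]  # remove the response before PCA

pca <- prcomp(predictors, center = TRUE, scale. = TRUE)

# Keep the first two component scores (an arbitrary choice for this sketch).
scores <- as.data.frame(pca$x[, 1:2])
scores$wt <- dat$wt

# Regress the response on components that do not contain it.
fit <- lm(wt ~ PC1 + PC2, data = scores)
print(summary(fit)$r.squared)
```

How many components to keep is a separate modelling decision and would need
to be justified for the actual data.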
Uwe Ligges