[R] how to tell if its better to standardize your data matrix first when you do principal

Michael Kubovy kubovy at virginia.edu
Tue Nov 24 03:06:16 CET 2009


On Nov 22, 2009, at 10:22 AM, Uwe Ligges wrote:

> masterinex wrote:
>> Hi guys, I'm trying to do principal component analysis in R. There are two ways of doing
>> it, I believe. One is doing principal component analysis right away; the other is standardizing the matrix first using s = scale(m) and then applying principal
>> component analysis. How do I tell which result is better? What values in particular should I
>> look at? I already managed to find the eigenvalues and eigenvectors, and the
>> proportion of variance for each eigenvector, using both methods.
> 
> Generally, it is better to standardize. But in some cases, e.g. when your variables share the same units and their scale also indicates their importance, it might make sense not to do so.
> You should think about the analysis; you cannot know which result is `better' unless you know an interpretation.
> 
> 
> 
>> I noticed that the proportion of variance for the first principal component without
>> standardizing had a larger value. Is there a meaning to it? Isn't this
>> always the case?
>> Lastly, if I am supposed to predict a variable, e.g. weight, should I drop
>> that variable from my data matrix when I do principal component
>> analysis?
> 
> 
> This sounds a bit like homework. If that is the case, please ask your teacher rather than this list.
> Anyway, it does not make sense to predict weight using a linear combination (principal component) that contains weight, does it?
> 
> Uwe Ligges
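
On the standardization question, a minimal sketch may help. It assumes the built-in USArrests data as a stand-in for the poster's matrix m (that choice is mine, not the original poster's) and uses prcomp() from base R, where scale. = TRUE gives the same analysis as running PCA on scale(m). A larger first-component share without standardizing is not guaranteed in general; it tends to appear when one variable has a much larger variance than the others, as Assault does here.

```r
# Stand-in data (an assumption for illustration): columns are on very
# different scales (arrest counts per 100,000 vs. a percentage).
data(USArrests)
m <- as.matrix(USArrests)

# PCA on the covariance matrix (centered, not standardized)
pca_raw <- prcomp(m, center = TRUE, scale. = FALSE)
# PCA on the correlation matrix (centered and standardized)
pca_scaled <- prcomp(m, center = TRUE, scale. = TRUE)

# Proportion of variance explained by each component
prop_raw    <- pca_raw$sdev^2 / sum(pca_raw$sdev^2)
prop_scaled <- pca_scaled$sdev^2 / sum(pca_scaled$sdev^2)
print(round(rbind(raw = prop_raw, scaled = prop_scaled), 3))

# Column variances show which variable dominates the unscaled analysis
print(round(apply(m, 2, var), 1))

# Loadings of the first component under each approach
print(round(cbind(raw    = pca_raw$rotation[, 1],
                  scaled = pca_scaled$rotation[, 1]), 3))
```

Comparing the two loading columns shows whether the unscaled first component is essentially one high-variance variable, which is the kind of interpretation check suggested above.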

It's likely to have been homework: A quick search on "masterinex" "xevilgang79" reveals which university this undergraduate student is at. It also produces a phone number, which can be used to look up an address, and a cell phone number.

MK
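
P.S. On the last question in the quoted thread, here is a rough sketch of leaving the response out of the PCA and then regressing it on the component scores. The mtcars data, the use of its wt column as "weight", and keeping k = 2 components are all assumptions made for illustration, not anything stated in the thread.

```r
# Stand-in data (an assumption): predict car weight (wt) from the other columns.
data(mtcars)

# Drop the response before computing the principal components
predictors <- mtcars[, setdiff(names(mtcars), "wt")]
pca <- prcomp(predictors, center = TRUE, scale. = TRUE)

k <- 2  # number of components kept; an arbitrary choice for this sketch

# Scores on the first k components, plus the response for the regression
scores <- as.data.frame(pca$x[, 1:k])
scores$wt <- mtcars$wt

# Regress the response on the component scores
fit <- lm(wt ~ ., data = scores)
print(coef(fit))
print(summary(fit)$r.squared)
```

Because wt never enters prcomp(), the components cannot contain the variable being predicted.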


