[R] how to tell if its better to standardize your data matrix first when you do principal

hadley wickham h.wickham at gmail.com
Sun Nov 22 22:35:19 CET 2009


You've asked the same question on stackoverflow.com and received the
same answer.  This is rude because it duplicates effort.  If you
urgently need a response to a question, perhaps you should consider
paying for it.

Hadley

On Sun, Nov 22, 2009 at 12:04 PM, masterinex <xevilgang79 at hotmail.com> wrote:
>
> So in which cases is it better to standardize the data matrix first?
> Also, is PCA generally used to predict the response variable? Should I
> keep that variable in my data matrix?
>
>
> Uwe Ligges-3 wrote:
>>
>> masterinex wrote:
>>>
>>>
>>> Hi guys,
>>>
>>> I'm trying to do principal component analysis in R. There are two ways
>>> of doing it, I believe. One is to run principal component analysis
>>> right away; the other is to standardize the matrix first using
>>> s = scale(m) and then apply principal component analysis.
>>> How do I tell which result is better? What values in particular should
>>> I look at? I have already found the eigenvalues and eigenvectors, and
>>> the proportion of variance for each eigenvector, using both methods.
>>>
>>
>> Generally, it is better to standardize. But in some cases, e.g. when
>> all variables are measured in the same units and their variances also
>> indicate their importance, it might make sense not to do so.
>> You should think about the analysis: you cannot know which result is
>> `better' unless you have an interpretation for it.
>>
>>
>>
>>> I noticed that the proportion of variance for the first principal
>>> component was larger without standardizing. Does that mean anything?
>>> Isn't this always the case?
>>> Lastly, if I am supposed to predict a variable, e.g. weight, should I
>>> drop that variable from my data matrix when I do principal component
>>> analysis?
>>
>>
>> This sounds a bit like homework. If that is the case, please ask your
>> teacher rather than this list.
>> Anyway, it does not make sense to predict weight using a linear
>> combination (principal component) that contains weight, does it?
>>
>> Uwe Ligges
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>
> --
> View this message in context: http://old.nabble.com/how-to-tell-if-its-better-to-standardize-your-data-matrix-first-when-you-do-principal-tp26462070p26466400.html
> Sent from the R help mailing list archive at Nabble.com.
>



-- 
http://had.co.nz/
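
The two points discussed in the thread (standardizing before PCA, and
leaving the response out of the PCA input) can be tried side by side. The
following is only a minimal sketch, not code from any of the posters: it
assumes the built-in mtcars data as a stand-in for the poster's matrix m,
and treats its wt (weight) column as the variable to be predicted. The
helper prop_var is a hypothetical name introduced here for illustration.

```r
# Illustration only: mtcars ships with R; the thread's actual data are unknown.
response   <- "wt"                              # variable to be predicted (assumed)
predictors <- setdiff(names(mtcars), response)
X <- mtcars[, predictors]                       # response is left out of the PCA input

# PCA on centred but unscaled data (covariance matrix)
pca_raw <- prcomp(X, center = TRUE, scale. = FALSE)

# PCA on centred data scaled to unit variance (correlation matrix)
pca_std <- prcomp(X, center = TRUE, scale. = TRUE)

# Standardizing by hand with scale() first gives the same component variances
pca_manual <- prcomp(scale(X))
same_sdev  <- isTRUE(all.equal(pca_manual$sdev, pca_std$sdev))

# Proportion of variance explained by each component (hypothetical helper)
prop_var <- function(p) p$sdev^2 / sum(p$sdev^2)
print(round(prop_var(pca_raw)[1:3], 3))
print(round(prop_var(pca_std)[1:3], 3))

# Column variances: columns on large scales dominate the unscaled PCA
print(round(sapply(X, var), 1))

# Loadings of the first component under each choice
print(round(pca_raw$rotation[, 1], 3))
print(round(pca_std$rotation[, 1], 3))

# Regress the response on the first two standardized component scores
scores <- as.data.frame(pca_std$x[, 1:2])
scores$wt <- mtcars[[response]]
fit <- lm(wt ~ PC1 + PC2, data = scores)
print(summary(fit)$r.squared)
```

Comparing the column variances with the first-component loadings shows
how much the unscaled result is driven by the measurement scale of each
variable. Whether that is desirable depends on the interpretation, as
noted in the thread.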



