[R-sig-eco] Should one remove highly correlated variables before doing PCA??

张勇 2010202035 at njau.edu.cn
Wed Mar 6 06:33:13 CET 2013


Hi list,

Maybe this is not an R question, but it has bothered me for a long time.

Some people think that a set of correlated variables may "load" onto several principal components (eigenvectors), so including many variables from such a set will differentially weight several eigenvectors and thereby change the directions of all eigenvectors, too. By this reasoning, we should discard some highly correlated variables before doing PCA.
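To make this first argument concrete, here is a minimal sketch in plain Python. The correlation values are made up for illustration (they are not from any data set), and `first_pc` is a hypothetical helper that finds the leading eigenvector by power iteration. It compares PCA on two variables, x and y, with PCA after adding two near-duplicates of x.

```python
import math

def first_pc(corr, iterations=500):
    """Leading eigenvector and eigenvalue of a symmetric matrix (power iteration)."""
    n = len(corr)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    w = [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v[i] * w[i] for i in range(n))
    return v, eigenvalue

# Assumed correlation matrix for two variables: x, y.
two_vars = [
    [1.0, 0.3],
    [0.3, 1.0],
]

# Same x and y, plus two near-duplicates of x (order: x, x2, x3, y).
with_copies = [
    [1.00, 0.95, 0.95, 0.30],
    [0.95, 1.00, 0.95, 0.30],
    [0.95, 0.95, 1.00, 0.30],
    [0.30, 0.30, 0.30, 1.00],
]

loadings_two, ev_two = first_pc(two_vars)
loadings_four, ev_four = first_pc(with_copies)

print("x, y only:      loadings", [round(v, 3) for v in loadings_two])
print("with x copies:  loadings", [round(v, 3) for v in loadings_four])
print("share of variance on PC1:", round(ev_two / 2, 3), "vs", round(ev_four / 4, 3))
```

With x and y alone, both load equally on the first component. Once the near-duplicates of x are included, the first component tilts toward the x group and y's loading shrinks relative to x's, which is the reweighting effect described above.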

On the other hand, some people think that correlated variables are fine, because the principal components that PCA outputs are orthogonal to each other. By this reasoning, we do not need to remove highly correlated variables before doing PCA.
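The orthogonality argument can be checked with a small sketch as well. It assumes two standardized variables with an illustrative correlation of 0.95; for a 2 x 2 correlation matrix the eigenvectors are known in closed form, (1, 1)/sqrt(2) and (1, -1)/sqrt(2), so no eigen-solver is needed. `quad` is a hypothetical helper for this example.

```python
import math

r = 0.95  # assumed correlation between the two standardized variables
corr = [[1.0, r], [r, 1.0]]

# Closed-form eigenvectors of a 2 x 2 correlation matrix.
s = 1.0 / math.sqrt(2.0)
pc1 = [s, s]
pc2 = [s, -s]

def quad(u, m, v):
    """Bilinear form u' M v, i.e. the (co)variance of the component scores."""
    return sum(u[i] * m[i][j] * v[j] for i in range(2) for j in range(2))

dot = sum(a * b for a, b in zip(pc1, pc2))  # orthogonality of the axes
var1 = quad(pc1, corr, pc1)                 # variance along PC1, equals 1 + r
var2 = quad(pc2, corr, pc2)                 # variance along PC2, equals 1 - r
cov12 = quad(pc1, corr, pc2)                # covariance between PC scores

print("pc1 . pc2 =", dot)
print("variance on PC1, PC2:", round(var1, 3), round(var2, 3))
print("covariance of scores:", cov12)
```

The components stay orthogonal and their scores uncorrelated however strongly the inputs are correlated. What changes is how the variance is split: here nearly all of it lands on the first component.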

Personally, I take the first approach (removing highly correlated variables). However, guided by practical ecological knowledge, I retain as many ecologically meaningful variables as I can.

What would you suggest on this issue? Any hints would be greatly appreciated. Thanks a lot in advance.

Best regards,

Yong


