[R] PCA on high dimensional data

mail me mailme842 at googlemail.com
Sat Dec 10 16:56:35 CET 2011


Hi:

I have a large dataset, mydata, with 1000 rows and 1000 columns. The rows
are genes and the columns are conditions (cond1, cond2, cond3, etc).

mydata <- read.table(file = "c:/file1.mtx", header = TRUE, sep = "")

I applied PCA as follows:

data_after_pca <- prcomp(mydata, retx = TRUE, center = TRUE, scale. = TRUE)

Now I get 1000 PCs. I keep the first three and put them in a new data frame:

new_data_frame <- as.data.frame(data_after_pca$x[, 1:3])
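
As an aside on the choice of three components: a minimal sketch, on a small
simulated matrix standing in for mydata, of checking how much variance the
leading PCs capture before discarding the rest. The names toy, pca,
var_explained, cum_explained and scores are made up for illustration and
are not part of the original code.

```r
# Simulated stand-in for mydata: 20 genes (rows) x 5 conditions (columns).
set.seed(1)
toy <- matrix(rnorm(20 * 5), nrow = 20,
              dimnames = list(paste0("gene", 1:20), paste0("cond", 1:5)))

pca <- prcomp(toy, retx = TRUE, center = TRUE, scale. = TRUE)

# Proportion of total variance captured by each PC, and the running total.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
cum_explained <- cumsum(var_explained)
print(round(cum_explained, 3))

# Keep the first three PCs. Row names (genes) carry over unchanged;
# the columns are named PC1, PC2, PC3.
scores <- as.data.frame(pca$x[, 1:3])
print(head(scores))
```

If the running total for three PCs is low on the real data, keeping more
components may be worth considering.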

After the PCA, new_data_frame no longer has the original cond1, cond2,
cond3 labels; its columns are named PC1, PC2, PC3 instead.

My question: is there any way to map PC1, PC2, PC3 back to the original
conditions, so that I still have a reference to the original condition
labels after PCA?
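
For what it is worth, a hedged sketch of one way to look at this, on
simulated data (toy, pca, loadings, top_cond and recon are illustrative
names, not from the code above). Each PC is a weighted combination of all
the conditions rather than a stand-in for any single one, so there is no
one-to-one relabelling. The weights are in the rotation element returned by
prcomp, whose rows keep the original column names.

```r
# Simulated stand-in for mydata: 20 genes (rows) x 5 conditions (columns).
set.seed(1)
toy <- matrix(rnorm(20 * 5), nrow = 20,
              dimnames = list(paste0("gene", 1:20), paste0("cond", 1:5)))
pca <- prcomp(toy, retx = TRUE, center = TRUE, scale. = TRUE)

# Loadings: rows are the original conditions, columns are the PCs.
loadings <- pca$rotation[, 1:3]
print(round(loadings, 3))

# For each PC, the condition with the largest absolute weight.
top_cond <- apply(abs(loadings), 2, function(w) names(which.max(w)))
print(top_cond)

# The scores in pca$x are the centered, scaled data times the rotation matrix.
recon <- scale(toy, center = pca$center, scale = pca$scale) %*% pca$rotation
```

On the real data the same lookup would be data_after_pca$rotation[, 1:3],
with cond1, cond2, cond3, etc. as row names.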

Thanks:
deb


