[R] Several PCA questions...
Dan Bolser
dmb at mrc-dunn.cam.ac.uk
Tue Jun 29 12:04:18 CEST 2004
Hi, I am doing PCA on several columns of data in a data.frame.
I am interested in particular rows of data which may have a particular
combination of 'types' of column values (without any pre-conception of
what they may be).
I do the following...
# My data table.
allDat <- read.table("big_select_thresh_5", header=TRUE)
# Where some rows look like this...
# PDB SUNID1 SUNID2 AA CH IPCA PCA IBB BB
# 3sdh 14984 14985 6 10 24 24 93 116
# 3hbi 14986 14987 6 10 20 22 94 117
# 4sdh 14988 14989 6 10 20 20 104 122
# NB First three columns = row ID, last 6 = variables
attach(allDat)
# My columns of interest (variables).
part <- data.frame(AA,CH,IPCA,PCA,IBB,BB)
pc <- princomp(part)
plot(pc)
The above plot shows that 95% of the variance is due to the first
'Component' (which I assume is AA),
i.e. all the variables behave in much the same way.
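One way to check that assumption is to look at the loadings, since a principal component is in general a weighted combination of all the variables rather than a single column. A minimal sketch on made-up data (the values are synthetic stand-ins, not taken from big_select_thresh_5, and the `size` factor is purely hypothetical):

```r
# Synthetic stand-in for the six variables (made-up numbers for illustration).
set.seed(1)
n <- 50
size <- rnorm(n, mean = 100, sd = 20)   # hypothetical shared "size" factor
part <- data.frame(
  AA   = 0.06 * size + rnorm(n, sd = 0.5),
  CH   = 0.10 * size + rnorm(n, sd = 0.5),
  IPCA = 0.20 * size + rnorm(n, sd = 1),
  PCA  = 0.22 * size + rnorm(n, sd = 1),
  IBB  = 0.95 * size + rnorm(n, sd = 3),
  BB   = 1.15 * size + rnorm(n, sd = 3)
)

pc <- princomp(part)

# Proportion of variance explained by each component.
var_explained <- pc$sdev^2 / sum(pc$sdev^2)
print(round(var_explained, 3))

# Weights of each original variable in the first component.
first_loadings <- loadings(pc)[, 1]
print(round(first_loadings, 3))
```

With the default covariance-based princomp, the variables with the largest variance tend to dominate the first component; princomp(part, cor = TRUE) is one way to put the columns on a common scale before comparing them.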
I then did ...
biplot(pc)
Which showed some outliers labelled with a numeric ID. How do I get back my
original three-part ID used in allDat?
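One possible approach, assuming the three ID columns together identify each row uniquely: biplot labels the points with the row names of the data passed to princomp, so pasting the ID columns into the row names should carry them through. A sketch on made-up data (all identifiers and values below are invented for illustration):

```r
set.seed(2)
n <- 10
allDat <- data.frame(
  PDB    = paste0("x", seq_len(n)),   # made-up identifiers
  SUNID1 = 10000 + seq_len(n),
  SUNID2 = 20000 + seq_len(n),
  AA = rnorm(n, 6, 1), CH = rnorm(n, 10, 2), IPCA = rnorm(n, 20, 3),
  PCA = rnorm(n, 22, 3), IBB = rnorm(n, 95, 8), BB = rnorm(n, 118, 8)
)

# Select the variables by name and use the combined ID as row names.
part <- allDat[, c("AA", "CH", "IPCA", "PCA", "IBB", "BB")]
rownames(part) <- paste(allDat$PDB, allDat$SUNID1, allDat$SUNID2, sep = "_")

pc <- princomp(part)

# The scores keep the row names, and biplot() uses them as point labels.
print(head(rownames(pc$scores)))
biplot(pc, cex = 0.6)
```

Selecting the columns from allDat by name also avoids the need for attach().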
In the above plot I saw all the variables (correctly named) pointing in
more or less the same direction (as shown by the variance). I then did the
following...
postscript(file="test.ps",paper="a4")
biplot(pc)
dev.off()
However, looking at test.ps with ggv shows that the arrows are missing...
Hmmm, they come back when I convert with pstoimg and view with xv... never mind.
Finally, I would like to make a contour plot of the above biplot. Is this
possible (or even a good way to present the data)?
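One way this might be done is to estimate a two-dimensional density over the first two component scores and contour that, here sketched with kde2d() from the MASS package. The `scores` matrix below is a made-up stand-in for pc$scores[, 1:2], and whether a density contour suits the real data is an open question:

```r
library(MASS)  # for kde2d(); MASS ships with standard R installations

set.seed(3)
# Made-up two-column stand-in for the first two columns of pc$scores.
scores <- cbind(Comp.1 = rnorm(200, sd = 3), Comp.2 = rnorm(200, sd = 1))

# Estimate a 2-D density over the score plane and draw its contours.
dens <- kde2d(scores[, 1], scores[, 2], n = 50)
contour(dens, xlab = "Comp.1", ylab = "Comp.2")

# Overlay the individual points so outliers stay visible.
points(scores, pch = ".", col = "grey40")
```

With the real analysis, scores would be replaced by pc$scores[, 1:2].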
Thanks very much for any feedback,
Dan.