[R] mva :: prcomp

Jonne Zutt j.zutt at ewi.tudelft.nl
Wed Mar 17 15:24:49 CET 2004


Dear R-list users,

I'm new to principal components and factor analysis.
I thought this method could be very useful for finding relationships
between several variables (I know some exist, but not which variables
are involved or what kind of relation it is), that is, as a structure
detection method.

Now, I'm experimenting with the function prcomp from the mva package.
In my source code below, I of course expect one of the columns to be
redundant (I provided one duplicate column). I know that both
avg.EDGE.IN.SHORTEST.PATH.COUNT and avg.DEGREE are related to
sum.delivery.penalty.
E.g. the bigger avg.DEGREE, the smaller sum.delivery.penalty.

My question is about the output of prcomp.
I understand that the cumulative proportion of variance reaches 100% at
the third principal component, just as I expected.
I see the components are sorted: the one that explains the most variance
is listed first.
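
As far as I can tell, the proportions in the summary output at the end of
this message follow from the printed standard deviations: square them to
get variances, then divide by the total. A small sketch of that reading,
using the four standard deviations copied from the pc2 output (the variable
names sdev, variance, prop_var and cum_prop are just mine, for illustration):

```r
# Standard deviations as printed for pc2 (copied from the output in this message)
sdev <- c(1.424074e+00, 1.000000e+00, 9.859080e-01, 5.711682e-17)

# The variance of each component is its squared standard deviation
variance <- sdev^2

# Share of the total variance per component, and the running total
prop_var <- variance / sum(variance)
cum_prop <- cumsum(prop_var)

print(round(prop_var, 3))
print(round(cum_prop, 3))
```

With four standardized columns the total variance is 4, which is why the
duplicate column shows up as a fourth component with essentially zero variance.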

But how can I figure out what these principal components are exactly?
Take PC1, for example. What is its exact meaning?
I assumed it is some linear combination of the variables I provided in
the call to prcomp, but how can I obtain this linear combination?
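
To make the assumption concrete: if each column of the Rotation matrix holds
the coefficients of one component, then multiplying the centered and scaled
data by the PC1 column should reproduce the PC1 scores. A minimal sketch of
that check on made-up data (the data frame toy and its columns a, b, c are
invented here and have nothing to do with my real file; it also assumes
prcomp is available without require(mva), as it is in current R versions
where it lives in the stats package):

```r
# Toy data, invented for illustration only (not the real statistics file)
set.seed(1)
toy <- data.frame(a = rnorm(50), b = rnorm(50), c = rnorm(50))
toy$b <- toy$b + 0.8 * toy$a   # make two of the columns correlated

pc <- prcomp(toy, retx = TRUE, center = TRUE, scale. = TRUE)

# Candidate coefficients of the linear combination for PC1
w1 <- pc$rotation[, "PC1"]
print(w1)

# Apply the same centering and scaling that prcomp used, then combine
z <- scale(toy, center = pc$center, scale = pc$scale)
pc1_by_hand <- as.vector(z %*% w1)

# Compare with the scores prcomp returns in pc$x when retx = TRUE
print(all.equal(pc1_by_hand, as.vector(pc$x[, "PC1"])))
```

If the comparison holds, PC1 is the weighted sum of the standardized
variables with the weights in w1, and the sign of the whole vector is
arbitrary.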

PS: I used http://www.statsoftinc.com/textbook/stfacan.html as a
reference, and help(prcomp) and help(princomp) of course.

Thanks for any help!
Jonne.


# Read a table
dir = "..."
file = "..." # huge file, 12 Mb
stats = read.table(paste(dir, file, sep=""), header=TRUE)

# Select several columns
data = subset(stats, select =
         c(sum.delivery.penalty,
           avg.EDGE.IN.SHORTEST.PATH.COUNT,
           avg.EDGE.IN.SHORTEST.PATH.COUNT,
           avg.DEGREE))

require(mva)
pc2 = prcomp(data, retx = TRUE, center = TRUE,
             scale. = TRUE, tol = NULL)
pc2
summary(pc2)

--- gives the following output

> pc2
Standard deviations:
[1] 1.424074e+00 1.000000e-00 9.859080e-01 5.711682e-17

Rotation:
                                            PC1           PC2           PC3
sum.delivery.penalty              -1.627945e-01 -1.539887e-12  9.866600e-01
avg.EDGE.IN.SHORTEST.PATH.COUNT   -6.976740e-01  2.413866e-16 -1.151131e-01
avg.EDGE.IN.SHORTEST.PATH.COUNT.1 -6.976740e-01  2.013413e-17 -1.151131e-01
avg.DEGREE                         2.505027e-13 -1.000000e+00 -1.519375e-12
                                            PC4
sum.delivery.penalty              -1.118300e-17
avg.EDGE.IN.SHORTEST.PATH.COUNT    7.071068e-01
avg.EDGE.IN.SHORTEST.PATH.COUNT.1 -7.071068e-01
avg.DEGREE                        -3.253830e-18
> summary(pc2)
Importance of components:
                         PC1   PC2   PC3      PC4
Standard deviation     1.424 1.000 0.986 5.71e-17
Proportion of Variance 0.507 0.250 0.243 0.00e+00
Cumulative Proportion  0.507 0.757 1.000 1.00e+00



