[R] R vs SPSS output for princomp
James Howison
jhowison at syr.edu
Tue May 6 02:43:26 CEST 2003
Hi,
I am using R to do a principal components analysis for a class
which is generally using SPSS - so some of my question relates to
SPSS output (and this might not be the right place). I have
scoured the mailing list and the web but can't get a feel for this.
It is annoying because they will be marking to the SPSS output.
I'm getting different values for the component loadings
in SPSS and in R - I suspect that there is some normalization or
scaling going on that I don't understand (and there is plenty I
don't understand). The scree plots (and thus the eigenvalues for
each component) and the Proportion of Variance figures are
identical - but the loadings differ considerably: the SPSS values
are much larger in magnitude than those shown by R (roughly double
on the first component).
Should the loadings returned by the R princomp function and the
SPSS "Component Matrix" be the same?
A subsidiary question: how does one approximate the
"Kaiser's little jiffy" test for extracting the components (SPSS
by default drops those components with eigenvalues below 1)?
I've been doing this with loadings(DV.pca)[,1:x] after inspecting
the scree plot (to set x) - but is there another way?
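For concreteness, the eigenvalue-below-1 cutoff can be applied without reading x off the plot. This is only a sketch: it assumes that the squared sdev values of a princomp fit with cor=TRUE are the eigenvalues SPSS thresholds on, and it uses the built-in USArrests data as a stand-in for the webeval variables, which are not reproduced here.

```r
# Sketch of the "eigenvalue > 1" rule on a princomp fit.
# USArrests is a built-in stand-in for the webeval data (assumption).
pca <- princomp(USArrests, cor = TRUE)

# With cor = TRUE, the squared component standard deviations are the
# eigenvalues of the correlation matrix.
eigenvalues <- pca$sdev^2

# Indices of the components whose eigenvalue exceeds 1.
keep <- which(eigenvalues > 1)

# Loadings restricted to the retained components; drop = FALSE keeps a
# matrix even if only one component survives the cutoff.
kept_loadings <- unclass(loadings(pca))[, keep, drop = FALSE]

print(round(eigenvalues, 3))
print(kept_loadings)
```

The same indexing should carry over to the real fit by swapping pca for DV.pca; whether a strict cutoff at 1 is the right choice is a separate question from how to compute it.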
The full R commands and SPSS syntax follow below along with the
differing output.
Thanks, James
http://freelancepropaganda.com
R analysis
===========
I run:
> library(mva)
> DVfmla
~webeval1 + webeval2 + webeval3 + webeval4 + webeval5 + webeval6 +
webeval7 + webeval8
> loadings(DV.pca <- princomp(DVfmla, scale=T, cor=T))
Loadings:
Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
webeval1 -0.357 0.258 -0.202 0.458 0.629 -0.350 0.112 -0.159
webeval2 -0.340 0.510 0.255 -0.305 0.651 0.136 -0.143
webeval3 -0.319 0.316 -0.276 -0.797 0.244 -0.145
webeval4 0.247 0.633 0.681 -0.248
webeval5 0.391 0.150 -0.357 -0.183 -0.158 -0.185 0.584 -0.513
webeval6 0.392 0.252 -0.282 0.140 -0.756 -0.334
webeval7 -0.382 0.128 -0.162 -0.651 -0.596 -0.114 0.121
webeval8 0.377 0.268 -0.428 0.158 0.143 0.746
<snip SS loadings>
> plot(DV.pca) # This is exactly the same as the SPSS scree plot.
SPSS Analysis
=============
FACTOR
/VARIABLES webeval1 webeval2 webeval3 webeval4
webeval5 webeval6 webeval7 webeval8
/MISSING LISTWISE
/ANALYSIS webeval1 webeval2 webeval3 webeval4
webeval5 webeval6 webeval7 webeval8
/PRINT INITIAL EXTRACTION
/PLOT EIGEN
/CRITERIA FACTORS(8) ITERATE(25)
/EXTRACTION PC
/ROTATION NOROTATE
/METHOD=CORRELATION .
As mentioned, the proportions of variance explained and the scree
plot are identical. However, SPSS produces this "Component Matrix",
which we, in class, have been calling "the loadings":
WEBEVAL1 -0.798 0.253 0.178 0.317 -0.370 0.167 -0.033 -0.037
WEBEVAL2 -0.764 0.487 0.026 0.188 0.186 -0.309 -0.108 -0.043
WEBEVAL3 -0.719 0.309 0.217 -0.564 -0.125 -0.040 0.043 0.052
WEBEVAL4 0.558 0.591 -0.563 -0.063 -0.029 0.131 0.030 -0.019
WEBEVAL5 0.864 0.161 0.313 -0.128 0.075 0.138 -0.221 -0.200
WEBEVAL6 0.876 0.252 0.237 0.100 0.008 0.017 -0.088 0.308
WEBEVAL7 -0.858 0.128 0.133 0.054 0.349 0.308 0.090 0.037
WEBEVAL8 0.847 0.256 0.316 0.111 0.000 -0.087 0.296 -0.094
Can anyone tell me why these are different? It seems likely that
this is a scaling of some kind, as the SPSS values just look to
have been made larger in some way. Or is it that SPSS is reporting
cumulatively while R is not?
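One way to test the scaling guess, as a sketch only: princomp returns unit-length eigenvectors, and a common alternative convention is to multiply each one by the square root of its eigenvalue, which gives the correlations between the variables and the components. Whether that is exactly what SPSS prints as the Component Matrix is an assumption here, though the squared entries of the first SPSS column above do sum to about 5.0, which is what that convention would produce for a first eigenvalue near 5. USArrests again stands in for the webeval data.

```r
# Sketch: rescale princomp loadings by sqrt(eigenvalue).
# USArrests is a built-in stand-in for the webeval data (assumption).
pca <- princomp(USArrests, cor = TRUE)

# princomp loadings are eigenvectors, each column with unit length.
eigenvectors <- unclass(loadings(pca))

# Multiply column j by sdev[j], i.e. by the square root of eigenvalue j.
component_matrix <- sweep(eigenvectors, 2, pca$sdev, "*")

# Column sums of squares: 1 before rescaling, the eigenvalues after.
print(colSums(eigenvectors^2))
print(colSums(component_matrix^2))

# The rescaled values equal the variable-by-component correlations,
# so this difference should be zero up to rounding error.
print(max(abs(component_matrix - cor(USArrests, pca$scores))))

print(round(component_matrix, 3))
```

If the rescaled DV.pca loadings match the SPSS table up to the sign of each column, the difference is a reporting convention rather than a different analysis; column signs are arbitrary in both programs.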
Thanks in advance,
James