[R] Can't reproduce clusplot princomp results.

Bjørn-Helge Mevik bhx2 at mevik.net
Tue May 24 09:19:54 CEST 2005


Thomas M. Parris writes:

> clusplot reports that the first two principal components explain
> 99.7% of the variability.
[...]

>> loadings(pca)
[...]
>                Comp.1 Comp.2 Comp.3 Comp.4
> SS loadings      1.00   1.00   1.00   1.00
> Proportion Var   0.25   0.25   0.25   0.25
> Cumulative Var   0.25   0.50   0.75   1.00

This has nothing to do with how much of the variability of the
original data is captured by each component; it merely measures
the sum of squared coefficients of each loading vector (and since
princomp standardises them to length one, every component gets the
same value).
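
To see where those rows come from, here is a minimal sketch. It uses the
built-in USArrests data as a stand-in (an assumption on my part, it is not
your data set) and recomputes the "SS loadings" and "Proportion Var" rows
by hand:

```r
# Stand-in data: USArrests ships with R and has 4 variables (assumption,
# not the data from the original question).
pca <- princomp(USArrests, cor = TRUE)

# Drop the "loadings" class to get a plain numeric matrix.
L <- unclass(loadings(pca))

# "SS loadings" is the column-wise sum of squared coefficients.
ss <- colSums(L^2)

# "Proportion Var" is that sum divided by the number of variables.
prop <- ss / nrow(L)

print(rbind("SS loadings" = ss, "Proportion Var" = prop))
```

Because each loading vector has length one, the first row is all ones and
the second is 1/4 for every component, whatever the data look like.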

What you want to look at is pca$sdev, for instance something like this:

totvar <- sum(pca$sdev^2)
rbind("explained var" = pca$sdev^2,
      "prop. expl. var" = pca$sdev^2/totvar,
      "cum.prop.expl.var" = cumsum(pca$sdev^2)/totvar)
                     Comp.1    Comp.2      Comp.3       Comp.4
explained var     3.4093746 0.5785399 0.011560142 0.0005252824
prop. expl. var   0.8523437 0.1446350 0.002890036 0.0001313206
cum.prop.expl.var 0.8523437 0.9969786 0.999868679 1.0000000000

And as you can see, two comps "explain" 99.7%. :-)
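
If you prefer not to build the table by hand, summary() on a princomp
object reports the same quantities. A small sketch, again with the
built-in USArrests data as a stand-in (an assumption, not your data):

```r
# Stand-in data (assumption): USArrests, analysed on the correlation scale.
pca <- princomp(USArrests, cor = TRUE)

# Variance captured by each component, and its share of the total.
v <- pca$sdev^2
prop <- v / sum(v)
cumprop <- cumsum(prop)

# summary() prints standard deviations, proportion of variance and
# cumulative proportion for each component.
imp <- summary(pca)
print(imp)
```

With cor = TRUE the variances in v add up to the number of variables,
which is why totvar comes out as 4 in the table above.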

-- 
Bjørn-Helge Mevik
