# [R] pca vs. pfa: dimension reduction

Jonathan Baron baron at psych.upenn.edu
Wed Mar 25 19:22:04 CET 2009

```
On 03/25/09 19:06, soeren.vogel at eawag.ch wrote:
> Can't make sense of calculated results and hope I'll find help here.
>
> variables. I hypothesise those three variables to be components (or
> indicators) of one latent factor. In order to reduce data (vars), I
> had the following idea: Calculate the factor underlying these three
> vars. Use the loadings and the original var values to construct a new
> (artificial) var: (B1 * X1) + (B2 * X2) + (B3 * X3) = ArtVar (brackets
> for readability). Use ArtVar for further analysis of the data, that
> is, as predictor etc.
>
> In my (I realise, elementary) psychological statistics readings I was
> taught to use pca for these problems. Referring to Venables & Ripley
> (2002, chapter 11), I applied "princomp" to my vars. But the outcome
> shows 4 components -- which is obviously not what I want. Reading
> specified factor very fine. But since this is a contradiction to
> theoretical introductions in so many texts I'm completely confused
> whether I'm right with these calculations.
>
> (1) Is there an easy example, which explains the differences between
> pca and pfa? (2) Which R procedure should I use to get what I want?

Possibly what you want is the first principal component, which is the
weighted sum that accounts for the most variance of the three
variables.  It does essentially what you say in your first paragraph.
So you want something like

p1 <- princomp(cbind(X1,X2,X3),scores=TRUE)
p1$scores[,1]
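
# A minimal sketch on simulated data (not the poster's data; the names g, X,
# w and ArtVar are made up for illustration).  It checks that the first
# component scores are the loadings-weighted sum of the centred variables,
# i.e. the (B1 * X1) + (B2 * X2) + (B3 * X3) construction asked about.
set.seed(1)
g  <- rnorm(200)                       # hypothetical common latent variable
X1 <- g + rnorm(200, sd = 0.5)
X2 <- g + rnorm(200, sd = 0.7)
X3 <- g + rnorm(200, sd = 0.9)
X  <- cbind(X1, X2, X3)
p1 <- princomp(X, scores = TRUE)
w  <- p1$loadings[, 1]                 # the weights B1, B2, B3
# princomp centres the variables before weighting, so do the same here
ArtVar <- scale(X, center = p1$center, scale = FALSE) %*% w
all.equal(as.numeric(ArtVar), as.numeric(p1$scores[, 1]))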

The trouble with factanal is that it does a rotation, and the default
is varimax.  The first factor will usually not be the same as the
first principal component (I think).  Perhaps there is another
rotation option that will give you this, but why bother even to look?
(I didn't, obviously.)
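
# One rough way to see how far the two approaches differ, sketched on
# simulated data (made-up names, not the poster's data): fit a single
# factor with factanal and correlate its scores with the first principal
# component.  The sign of either score is arbitrary, hence abs().
set.seed(1)
g  <- rnorm(200)                       # hypothetical common latent variable
X  <- cbind(X1 = g + rnorm(200, sd = 0.5),
            X2 = g + rnorm(200, sd = 0.7),
            X3 = g + rnorm(200, sd = 0.9))
p1 <- princomp(X, scores = TRUE)
f1 <- factanal(X, factors = 1, scores = "regression")
abs(cor(f1$scores[, 1], p1$scores[, 1]))   # a value near 1 means close agreement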

Jon
--
Jonathan Baron, Professor of Psychology, University of Pennsylvania