# [R] pca vs. pfa: dimension reduction

William Revelle lists at revelle.net
Thu Mar 26 01:58:12 CET 2009

```
Dear Sören, Mark, and Jon,

At 12:51 PM -0700 3/25/09, Mark Difford wrote:
>Hi Sören,
>
>>>  (1) Is there an easy example, which explains the differences between
>>>  pca and pfa? (2) Which R procedure should I use to get what I want?
>
>There are a number of fundamental differences between PCA and FA (Factor
>Analysis), which unfortunately are quite widely ignored. FA is explicitly
>model-based, whereas PCA does not invoke an explicit model. FA is also
>designed to detect structure, whereas PCA focuses on variance, to put things
>simply. In more detail, the two methods "attack" the covariance matrix in
>different ways: in PCA the focus of decomposition is on the diagonal
>elements, whereas in FA the focus is on the off-diagonal elements.

This is nicely put.  Less concisely, see pages
139-149 of my (under development)
book on psychometric theory using R
(http://personality-project.org/r/book/Chapter6.pdf)
In particular, on page 149:

"Although on the surface, the component model and
factor model appear to be very similar
(compare Tables 6.6 and 6.7), they are in fact
very different. One example of this is when an
additional variable is added to the correlation
matrix (Table 6.8). In this case, two additional
variables are added to the correlation matrix.
The factor pattern does not change, but the
component pattern does. Why is this? Because the
components are aimed at accounting for
all of the variance of the matrix, adding new
variables increases the amount of variance to be
explained and changes the previous estimates. But
the common part of the variables (that
which is estimated by factors) is not sensitive
to the presence (or absence) of other variables.
Although a fundamental difference between the two
models, this problem of the additional
variable is most obvious when there are not very
many variables and becomes less of an
empirical problem as the number of variables increases."

>
>Take a look at Prof. Revelle's psych package (function omega &c). Note also
>that factanal has a rotation = "none" option.
>
>Regards, Mark.
>
>
>soeren.vogel wrote:
>>
>>  Can't make sense of calculated results and hope I'll find help here.
>>
>>  variables. I hypothesise those three variables to be components (or
>>  indicators) of one latent factor. In order to reduce data (vars), I
>>  had the following idea: Calculate the factor underlying these three
>>  vars. Use the loadings and the original var values to construct an new
>>  (artificial) var: (B1 * X1) + (B2 * X2) + (B3 * X3) = ArtVar (brackets
>>  for readability). Use ArtVar for further analysis of the data, that
>>  is, as predictor etc.

For 3 variables, there is only one factor
possible, so rotation is not a problem. (For 1
factor, there are 3 unknown loadings and 3
known correlations.  The model is just
identified.)

>>
>>  In my (I realise, elementary) psychological statistics readings I was
>>  taught to use pca for these problems. Referring to Venables & Ripley
>>  (2002, chapter 11), I applied "princomp" to my vars. But the outcome
>>  shows 4 components -- which is obviously not what I want. Reading
>>  specified factor very fine. But since this is a contradiction to
>>  theoretical introductions in so many texts I'm completely confused
>>  whether I'm right with these calculations.

If you want to think of what these variables have
in common, use factor analysis; if you want to
summarize them all most efficiently with one
composite, use principal components.  These are
very different models.

As Mark said, the difference is that FA accounts
for the covariances (the off diagonal elements)
which reflect what the variables have in common.
PCA accounts for the entire matrix, which, in a
3 x 3 problem, is primarily the diagonal variances.

Bill

>>
>>  (1) Is there an easy example, which explains the differences between
>>  pca and pfa? (2) Which R procedure should I use to get what I want?

>>
>>  Thank you for your help
>>
>>  Sören
>>
>>
>>  Refs.:
>>
>>  Venables, W. N., and Ripley, B. D. (2002). Modern applied statistics
>>  with S (4th edition). New York: Springer.
>>
>>  ______________________________________________
>>  R-help at r-project.org mailing list
>>  https://stat.ethz.ch/mailman/listinfo/r-help
>>  http://www.R-project.org/posting-guide.html
>>  and provide commented, minimal, self-contained, reproducible code.
>>
>>
>
>--
>View this message in context:
>http://www.nabble.com/pca-vs.-pfa%3A-dimension-reduction-tp22707926p22709481.html
>Sent from the R help mailing list archive at Nabble.com.

--
William Revelle		http://personality-project.org/revelle.html
Professor			http://personality-project.org/personality.html
Department of Psychology             http://www.wcas.northwestern.edu/psych/
Northwestern University	http://www.northwestern.edu/
Attend  ISSID/ARP:2009               http://issid.org/issid.2009/

```
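The just-identified one-factor case described above can be checked by hand. What follows is a minimal sketch in plain Python, chosen only so that it runs without any packages; it is not how `factanal`, `princomp`, or the psych package compute their results. The three correlations are made-up illustrative values, and the helper name `first_component` is hypothetical. With a single factor, each correlation is the product of two loadings, so three correlations give three equations in three unknown loadings. The first principal component of the same matrix is found by power iteration for comparison.

```python
import math

# Hypothetical correlations among three variables (illustrative values only).
r12, r13, r23 = 0.56, 0.48, 0.42

# One-factor model: r_ij = l_i * l_j for i != j.
# Three equations, three unknowns, so the model is just identified.
fa_loadings = [
    math.sqrt(r12 * r13 / r23),
    math.sqrt(r12 * r23 / r13),
    math.sqrt(r13 * r23 / r12),
]

# Full correlation matrix, with unit variances on the diagonal.
R = [
    [1.0, r12, r13],
    [r12, 1.0, r23],
    [r13, r23, 1.0],
]


def first_component(matrix, iters=200):
    """Largest eigenvalue and component loadings via power iteration."""
    n = len(matrix)
    v = [1.0] * n
    for _ in range(iters):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit-length vector v.
    eigenvalue = sum(
        v[i] * sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)
    )
    # Component loadings are the eigenvector scaled by sqrt(eigenvalue).
    return eigenvalue, [math.sqrt(eigenvalue) * x for x in v]


eigenvalue, pc_loadings = first_component(R)

print("factor loadings:   ", [round(x, 3) for x in fa_loadings])
print("component loadings:", [round(x, 3) for x in pc_loadings])
print("r12 implied by factor:   ", round(fa_loadings[0] * fa_loadings[1], 3))
print("r12 implied by component:", round(pc_loadings[0] * pc_loadings[1], 3))
```

With these illustrative values, the factor loadings reproduce the off-diagonal correlations exactly and leave the rest of each diagonal element to the unique variances. The component loadings come out larger and overstate the correlations, because the component also has to account for the diagonal. For real data, `factanal` and `princomp`, both mentioned in the thread, are the R functions to use.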