[R] Q and R mode in Principal Component Analysis

Wed Sep 7 00:01:05 CEST 2011

At 4:10 PM +0100 9/6/11, Lívio Cipriano wrote:
>Hi,
>
>Can anyone explain to me the differences in Q and R mode in Principal Component
>Analysis, as performed by prcomp and princomp respectively.

Dear Livio,
   The help files say it pretty well. Compare the help file of prcomp:

"The calculation is done by a singular value 
decomposition of the (centered and possibly 
scaled) data matrix, not by using eigen on the 
covariance matrix. This is generally the 
preferred method for numerical accuracy."

with the help file of princomp:

"princomp only handles so-called R-mode PCA, that 
is feature extraction of variables. If a data 
matrix is supplied (possibly via a formula) it is 
required that there are at least as many units as 
variables. For Q-mode PCA use prcomp."
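
To make the two help files concrete, here is a minimal sketch on 
simulated data (the matrix X, the seed and the sizes are my own 
assumptions, not anything from your data).  It checks that the SVD 
route (prcomp) and the eigen route give the same component variances 
when there are more units than variables:

```r
set.seed(42)                                   # arbitrary seed, for reproducibility only
X <- matrix(rnorm(50 * 4), nrow = 50, ncol = 4)  # simulated: 50 units, 4 variables

# prcomp: singular value decomposition of the centered data matrix
p_svd <- prcomp(X)

# eigen route: eigenvalues of the covariance matrix
e <- eigen(cov(X), symmetric = TRUE)

# The component variances agree up to numerical error
var_svd <- p_svd$sdev^2
var_eig <- e$values
print(all.equal(var_svd, var_eig))

# princomp uses divisor n rather than n - 1 for the covariance matrix,
# so its variances are smaller by the factor (n - 1) / n
p_eig <- princomp(X)
n <- nrow(X)
print(all.equal(unname(p_eig$sdev^2), var_eig * (n - 1) / n))
```

On well-conditioned data like this the two routes are 
indistinguishable; the accuracy argument in the prcomp help file 
matters for ill-conditioned data.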

This R and Q (as well as S and T) terminology was 
introduced (at least in psychology) by Ray 
Cattell in his discussion of the "Data Box".  It 
is the idea that you can consider three 
dimensions of data (across subjects, variables, 
and time).  Then there are six different ways to 
cut up the data.  A typical data matrix has rows 
for observations and columns for variables. 
Typically the number of rows >> columns.  If you 
are trying to find a structure that reduces the 
complexity of the variables, you do the normal 
analysis (R) of the variables.  An alternative is 
to do the analysis on the transpose of the data 
matrix (Q analysis), that is, to try to reduce 
the complexity of the rows.
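
As a rough illustration of the two cuts through the data box, the 
sketch below runs prcomp on a simulated matrix and on its transpose.  
The data, the dimnames and the object names r_mode and q_mode are 
made up for the example:

```r
set.seed(1)                                    # arbitrary seed
X <- matrix(rnorm(20 * 5), nrow = 20, ncol = 5,
            dimnames = list(paste0("subj", 1:20), paste0("var", 1:5)))

# R analysis: rows are subjects, components are combinations of the 5 variables
r_mode <- prcomp(X)

# Q analysis: transpose first, so components are combinations of the 20 subjects
q_mode <- prcomp(t(X))

print(dim(r_mode$rotation))   # loadings: one row per variable
print(dim(q_mode$rotation))   # loadings: one row per subject
```

In the R analysis the loadings describe variables; in the Q analysis 
they describe subjects, and there can be no more components than the 
smaller dimension of the matrix.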

Transposing is not a problem if you do a singular value 
decomposition (which is what prcomp does).  It 
can be if you do a princomp analysis, which is 
based upon an eigen decomposition of the covariance 
matrix of the data.
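
A small sketch of what that looks like in practice, again on 
simulated data of my own choosing: prcomp accepts the transposed 
(wide) matrix, while princomp stops with an error because there are 
fewer units than variables.  The tryCatch wrapper is only there so 
the example runs to the end.

```r
set.seed(2)                                    # arbitrary seed
X <- matrix(rnorm(20 * 5), nrow = 20, ncol = 5)  # simulated: 20 units, 5 variables

# Q analysis via SVD: works on the 5 x 20 transposed matrix
q_svd <- prcomp(t(X))

# Q analysis via princomp: fails, since 5 units < 20 variables
q_eig <- tryCatch(princomp(t(X)),
                  error = function(e) conditionMessage(e))
print(q_eig)   # the error message, returned as a character string
```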

Let nXv represent your original matrix (n 
observations on v variables).  For an R analysis 
using princomp, you are finding the principal 
components of the covariance matrix C, which is of 
size v x v with rank at most the lesser of n and v.  
But for a Q analysis using princomp, you 
are trying to find the principal components 
of a covariance matrix C*, which has dimensions n 
x n but still has a rank of at most the lesser of 
n and v.

  That is, if the number of rows > number of 
columns, the rank of the covariance matrix of the 
transposed matrix will still be at most the number 
of columns, although the size of that covariance 
matrix will be n x n.  Such a matrix is necessarily 
singular whenever n > v.
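
The rank argument can be checked numerically.  This is only a sketch 
on simulated data; num_rank is a hypothetical helper of mine that 
counts eigenvalues above a relative tolerance, not a base R function.  
Note that centering costs one further dimension, so the observed rank 
of C* here is v - 1.

```r
set.seed(3)                                    # arbitrary seed
n <- 20; v <- 5
X <- matrix(rnorm(n * v), nrow = n, ncol = v)  # simulated: n units, v variables

C      <- cov(X)      # v x v covariance matrix (R analysis)
C_star <- cov(t(X))   # n x n covariance matrix of the transposed data (Q analysis)

# Hypothetical helper: numerical rank of a symmetric matrix,
# counted as the number of eigenvalues above a relative tolerance
num_rank <- function(S, tol = 1e-8) {
  ev <- eigen(S, symmetric = TRUE, only.values = TRUE)$values
  sum(ev > tol * max(ev))
}

rank_C      <- num_rank(C)
rank_C_star <- num_rank(C_star)

print(dim(C_star))     # n x n
print(rank_C)          # full rank for the v x v matrix
print(rank_C_star)     # far below n, so C_star is singular
```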

Q analysis is looking for patterns of similarity 
in the subjects over variables, R analysis is 
looking for similarity in the variables over 
subjects.  This then gets generalized to the case 
of subjects over time, variables over time, and so on.

"The data box emphasized that we are not limited 
to correlating tests over people at one time. In 
its 1946 formulation, there were six 'designs of 
covariation using literal measurement' and 12 
'designs of covariation using differential or 
ratio measurement' (Cattell, 1946c, p 94-95). 
Considering Persons, Tests, and Occasions as the 
fundamental dimensions, it was possible to 
generalize the normal correlation of Tests over 
Persons design (R analysis) to consider how 
Persons correlated over Tests (Q analysis), or 
Tests over Occasions (P analysis), etc. Cattell 
(1966) extended the data box's original three 
dimensions to five by adding Background or 
preceding conditions as well as Observers (see 
also Cattell (1977)). Applications of the data 
box concept have been seen throughout psychology, 
but the primary influence has probably been on 
those who study personality development and 
change over the life span (McArdle & Bell, 2000, 
Mroczek, 2007, Nesselroade, 1984). Unfortunately, 
even for the original three dimensions, Cattell 
(1978) used a different notation than he did in 
Cattell (1966, 1977) or Cattell (1946b)."
British Journal of Psychology (2009), 100, 253-257
© 2009 The British Psychological Society

[1]	R. B. Cattell. The data box: Its ordering 
of total resources in terms of possible 
relational systems. In R. B. Cattell, editor, 
Handbook of multivariate experimental psychology, 
pages 67-128. Rand-McNally, Chicago, 1966.

  I suspect this is more than you wanted to know.

Bill

>
>Regards
>
>Lívio Cipriano