[R] factanal

Christof Schuster cschuste at nd.edu
Thu May 1 20:41:19 CEST 2003


# I have a question about how factanal is calculating the regression factor
# scores based on an oblique rotation (promax) of the factors. 
#
# As is explained in the help file, regression factor scores are
# obtained as
#
#  hat f = Lambda' Sigma^-1 x
#
# However, according to Harman's "Modern Factor Analysis" (e.g. second
# edition, pp. 351-352) the formula is 
#
#  hat f = Phi Lambda' Sigma^-1 x,
#
# where Phi is the correlation between factors. Of course, for orthogonal 
# rotations the formulas are identical because in this case Phi=I.
#
# Let me illustrate the difference between the formulas numerically:
# X is a data (10x6) data matrix of standardized variables.

"X" <-
structure(c(-0.697, -1.787, 0.206, -0.191, -0.606, 0.171, 1.46, 
-0.639, 0.779, 1.304, -0.7, -1.538, -0.913, -0.43, -0.225, -0.417, 
1.038, 0.888, 1.595, 0.702, -1.268, -2.018, 0.079, 1.074, 0.296, 
-0.62, 0.532, 0.306, 0.775, 0.844, -2.245, 0.486, 0.801, 0.002, 
-0.602, -0.519, 1.261, -0.372, 0.499, 0.688, -1.973, -0.163, 
0.964, -0.071, -0.99, 0.694, -0.364, -0.305, 1.215, 0.992, -1.674, 
-0.065, 1.043, -0.159, -1.174, 0.648, 0.848, 1.101, -1.055, 0.488
), .Dim = c(10, 6))

library(mva)
res <- factanal(X, factors=2, rotation="promax", scores="regression")
round(res$scores, 3)

# yields the factor scores:
#        Factor1 Factor2
#   [1,]   0.031  -2.137
#   [2,]  -2.319   1.141
#   [3,]  -0.899   1.375
#   [4,]   0.098  -0.058
#   [5,]   0.215  -0.943
#   [6,]  -0.402   0.218
#   [7,]   0.828   0.488
#   [8,]   0.364  -0.364
#   [9,]   1.313  -0.204
#  [10,]   0.771   0.485

# Now, calculating the factor scores using the first formula I obtain
# (up to rounding differences) the same result
R <- cor(X)
L <- loadings(res)
round(t(t(L) %*% solve(R) %*% t(X)), 3)

# However, if I use the following SAS commands using the exactly the
# same data

# data x;
#    input x1 x2 x3 x4 x5 x6;
#    cards;
# -0.697 -0.700 -1.268 -2.245 -1.973 -1.674
# -1.787 -1.538 -2.018  0.486 -0.163 -0.065
#  0.206 -0.913  0.079  0.801  0.964  1.043
# -0.191 -0.430  1.074  0.002 -0.071 -0.159
# -0.606 -0.225  0.296 -0.602 -0.990 -1.174
#  0.171 -0.417 -0.620 -0.519  0.694  0.648
#  1.460  1.038  0.532  1.261 -0.364  0.848
# -0.639  0.888  0.306 -0.372 -0.305  1.101
#  0.779  1.595  0.775  0.499  1.215 -1.055
#  1.304  0.702  0.844  0.688  0.992  0.488
# ;
#
# proc factor data=x corr method=ml nfactors=2 prerotate=varimax rotate=promax out=fscores score;
# run;
#
# proc print data=fscores;
#   var factor1 factor2;
# run;
#
# The factor scores one obtains:
#     Obs     Factor1     Factor2
#       1    -1.00830    -2.12208
#       2    -1.76352     0.09283
#       3    -0.23031     0.96792
#       4     0.06982    -0.01378
#       5    -0.24412    -0.84552
#       6    -0.29643     0.03564
#       7     1.06502     0.86195
#       8     0.18723    -0.19927
#       9     1.21346     0.38882
#      10     1.00714     0.83347

# Clearly, these scores are very different from the ones factanal
# produced. However, because the promax-function does not calculate
# the Phi-matrix, one cannot readily calculate the factor scores
# according to the second formula from Harman's book.

# However, one may calculate Phi in the following way using varimax as
# the prerotation:
res <- factanal(X, factors=2, rotation="none")
vm <- varimax(loadings(res))
pm <- promax(loadings(vm), m=3) # m=3 is default in SAS
L <- loadings(pm)

# Calculate the "normalized oblique transformation matrix" (label from
# SAS output)
A <- vm$rotmat %*% pm$rotmat

# Calculate the "inter-factor correlation matrix" (label from SAS output
Phi <- solve(t(A) %*% A)

# Calculating regression factor scores according to the second formula
# as
t(Phi %*% t(L) %*% solve(R) %*% t(X))
# yields scores that are pretty close, although not identical to the 
# SAS factor scores. Specifically the scores one obtains are
#
# [1,] -0.95635745 -2.12230012
# [2,] -1.79135903  0.08454732
# [3,] -0.26372828  0.96485151
# [4,]  0.07127957 -0.01351224
# [5,] -0.22118873 -0.84499215
# [6,] -0.30174316  0.03427963
# [7,]  1.05315509  0.86483663
# [8,]  0.19603657 -0.19803312
# [9,]  1.21844347  0.39342954
#[10,]  0.99537610  0.83626200

# So, I guess, I am wondering does factanal calculate the
# regression factor scores correctly? Maybe I am missing
# something. Any comments will be highly appreciated.

# Thanks,
#   Christof Schuster

Christof Schuster
University of Notre Dame
Department of Psychology                       
103 Haggar Hall
Notre Dame, IN 46556

Tel: (574) 631-5473    email: cschuste at nd.edu
Fax: (574) 631-8883    www.nd.edu/~cschuste



More information about the R-help mailing list