factanal {stats}  R Documentation 
Perform maximum-likelihood factor analysis on a covariance matrix or data matrix.
factanal(x, factors, data = NULL, covmat = NULL, n.obs = NA,
subset, na.action, start = NULL,
scores = c("none", "regression", "Bartlett"),
rotation = "varimax", control = NULL, ...)
x 
A formula or a numeric matrix or an object that can be coerced to a numeric matrix. 
factors 
The number of factors to be fitted. 
data 
An optional data frame (or similar: see model.frame), used only if x is a formula.
covmat 
A covariance matrix, or a covariance list as returned by cov.wt.
n.obs 
The number of observations, used if covmat is a covariance matrix.
subset 
A specification of the cases to be used, if x is used as a matrix or formula.
na.action 
The na.action to be used if x is used as a formula.
start 
NULL or a matrix of starting values, each column giving an initial set of uniquenesses.
scores 
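Type of scores to produce, if any. The default is "none"; "regression" gives Thomson's scores and "Bartlett" gives Bartlett's weighted least-squares scores.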
rotation 
character. 
control 
A list of control values,

... 
Components of control can also be supplied as named arguments to factanal.
The factor analysis model is
x = \Lambda f + e
for a p-element vector x, a p \times k matrix \Lambda of loadings, a k-element vector f of scores and a p-element vector e of errors. None of the components other than x is observed, but the major restriction is that the scores be uncorrelated and of unit variance, and that the errors be independent with variances \Psi, the uniquenesses. It is also common to scale the observed variables to unit variance, and that is done in this function. Thus factor analysis is in essence a model for the correlation matrix of x,
\Sigma = \Lambda\Lambda^\prime + \Psi
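As a rough illustration of the model equation, the following sketch rebuilds the model-implied correlation matrix from a fitted object and compares it with the observed one. It assumes the built-in ability.cov covariance list and a two-factor fit; the names Lambda, Psi and Sigma_hat are chosen here for illustration only.

```r
# Fit a two-factor model to the built-in ability.cov covariance list.
fa <- factanal(factors = 2, covmat = ability.cov)

Lambda <- unclass(loadings(fa))   # p x k matrix of fitted loadings
Psi    <- diag(fa$uniquenesses)   # diagonal matrix of uniquenesses

# Model-implied correlation matrix: Lambda Lambda' + Psi
Sigma_hat <- tcrossprod(Lambda) + Psi

# Residuals between the observed and model-implied correlations
round(fa$correlation - Sigma_hat, 3)
```

The diagonal of Sigma_hat should be close to one, since the analysis is carried out on the correlation scale.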
There is still some indeterminacy in the model, for it is unchanged if \Lambda is replaced by \Lambda G for any orthogonal matrix G. Such matrices G are known as rotations (although the term is applied also to non-orthogonal invertible matrices).
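The indeterminacy can be checked numerically. The sketch below post-multiplies unrotated loadings by a randomly generated orthogonal k \times k matrix and confirms that \Lambda\Lambda^\prime is unchanged. The random matrix G and the use of ability.cov are illustrative choices, not part of factanal itself.

```r
# Unrotated two-factor fit to the built-in ability.cov data
fa <- factanal(factors = 2, covmat = ability.cov, rotation = "none")
Lambda <- unclass(loadings(fa))
k <- ncol(Lambda)

set.seed(1)
# Random orthogonal k x k matrix, taken from a QR decomposition
G <- qr.Q(qr(matrix(rnorm(k * k), k, k)))

Lambda_rot <- Lambda %*% G

# G is orthogonal, and the rotated loadings imply the same Lambda Lambda'
all.equal(crossprod(G), diag(k))
all.equal(tcrossprod(Lambda_rot), tcrossprod(Lambda))
```

Because only \Lambda\Lambda^\prime enters the model for the correlation matrix, the uniquenesses and the fit are the same for either set of loadings.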
If covmat is supplied it is used. Otherwise x is used if it is a matrix, or a formula x is used with data to construct a model matrix, and that is used to construct a covariance matrix. (It makes no sense for the formula to have a response, and all the variables must be numeric.) Once a covariance matrix is found or calculated from x, it is converted to a correlation matrix for analysis. The correlation matrix is returned as component correlation of the result.
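A minimal sketch of the three ways of supplying the input, using simulated data. The two-factor structure, the sample size and the variable names v1 to v6 are hypothetical values picked for this illustration.

```r
set.seed(42)
n <- 500
# Hypothetical structure: v1-v3 load on factor 1, v4-v6 on factor 2
Lambda <- cbind(c(0.8, 0.7, 0.6, 0, 0, 0),
                c(0, 0, 0, 0.8, 0.7, 0.6))
f <- matrix(rnorm(n * 2), n, 2)                      # factor scores
e <- matrix(rnorm(n * 6), n, 6) %*%
     diag(sqrt(1 - rowSums(Lambda^2)))               # errors with variances Psi
X <- f %*% t(Lambda) + e
colnames(X) <- paste0("v", 1:6)
dat <- as.data.frame(X)

fa_mat  <- factanal(X, factors = 2)                                 # data matrix
fa_cov  <- factanal(covmat = cov(X), factors = 2, n.obs = nrow(X))  # covariance matrix
fa_form <- factanal(~ v1 + v2 + v3 + v4 + v5 + v6,
                    factors = 2, data = dat)                        # formula

# All three routes analyse the same correlation matrix
all.equal(fa_mat$uniquenesses, fa_cov$uniquenesses)
all.equal(fa_mat$uniquenesses, fa_form$uniquenesses)
all.equal(fa_mat$correlation, cor(X))
```

Only the matrix and formula routes keep the raw data, so only they can produce scores.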
The fit is done by optimizing the log likelihood assuming multivariate normality over the uniquenesses. (The maximizing loadings for given uniquenesses can be found analytically: Lawley & Maxwell (1971, p. 27).) All the starting values supplied in start are tried in turn and the best fit obtained is used. If start = NULL then the first fit is started at the value suggested by Jöreskog (1963) and given by Lawley & Maxwell (1971, p. 31), and then control$nstart - 1 other values are tried, randomly selected as equal values of the uniquenesses.
The uniquenesses are technically constrained to lie in [0, 1], but near-zero values are problematical, and the optimization is done with a lower bound of control$lower, default 0.005 (Lawley & Maxwell, 1971, p. 32).
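The starting values and the lower bound can be varied as in the following sketch. The particular settings (three starts, a bound of 0.01, starting uniquenesses of 0.5 and 0.2) are arbitrary values chosen for illustration, and ability.cov is used only because it ships with R.

```r
p <- nrow(ability.cov$cov)

# Default: a single start and a lower bound of 0.005
fa_default <- factanal(factors = 2, covmat = ability.cov)

# Three starts in total (the default one plus two random ones), higher bound
set.seed(123)  # the additional starting values are random
fa_multi <- factanal(factors = 2, covmat = ability.cov,
                     control = list(nstart = 3, lower = 0.01))

# Explicit starting uniquenesses, one column per start
fa_start <- factanal(factors = 2, covmat = ability.cov,
                     start = cbind(rep(0.5, p), rep(0.2, p)))

# Compare the uniquenesses found by each strategy
cbind(default = fa_default$uniquenesses,
      multi   = fa_multi$uniquenesses,
      start   = fa_start$uniquenesses)
```

If the columns differ noticeably, the likelihood probably has several local optima and more starts may be worth trying.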
Scores can only be produced if a data matrix is supplied and used. The first method is the regression method of Thomson (1951), the second the weighted least squares method of Bartlett (1937, 1938). Both are estimates of the unobserved scores f. Thomson's method regresses (in the population) the unknown f on x to yield
\hat f = \Lambda^\prime \Sigma^{-1} x
and then substitutes the sample estimates of the quantities on the right-hand side. Bartlett's method minimizes the sum of squares of standardized errors over the choice of f, given (the fitted) \Lambda.
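Both estimators can be reproduced by hand from a fitted object, which may help in reading the formulas. The sketch below assumes simulated data with a hypothetical two-factor structure, uses the sample correlation matrix as the estimate of \Sigma, and applies the default varimax rotation; the closed form used for Bartlett's scores is the usual weighted least squares solution.

```r
set.seed(42)
n <- 500
# Hypothetical loadings used only to simulate data
Lambda0 <- cbind(c(0.8, 0.7, 0.6, 0, 0, 0),
                 c(0, 0, 0, 0.8, 0.7, 0.6))
X <- matrix(rnorm(n * 2), n, 2) %*% t(Lambda0) +
     matrix(rnorm(n * 6), n, 6) %*% diag(sqrt(1 - rowSums(Lambda0^2)))
colnames(X) <- paste0("v", 1:6)

fa_reg  <- factanal(X, factors = 2, scores = "regression")
fa_bart <- factanal(X, factors = 2, scores = "Bartlett")

Z       <- scale(X)                        # standardized observations
R       <- cor(X)                          # sample estimate of Sigma
Lambda  <- unclass(loadings(fa_reg))       # fitted loadings
Psi_inv <- diag(1 / fa_reg$uniquenesses)   # inverse uniquenesses

# Thomson: f_hat = Lambda' Sigma^{-1} x, computed for every row of Z
reg_manual <- Z %*% solve(R, Lambda)

# Bartlett: f_hat = (Lambda' Psi^{-1} Lambda)^{-1} Lambda' Psi^{-1} x
A <- t(Lambda) %*% Psi_inv
bart_manual <- t(solve(A %*% Lambda, A %*% t(Z)))

all.equal(unname(reg_manual),  unname(fa_reg$scores))
all.equal(unname(bart_manual), unname(fa_bart$scores))
```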
If x is a formula then the standard NA handling is applied to the scores (if requested): see napredict.
The print method (documented under loadings) follows the factor analysis convention of drawing attention to the patterns of the results, so the default precision is three decimal places, and small loadings are suppressed.
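The display can be adjusted through the arguments of the print methods, as in this short sketch. The digits, cutoff and sort values shown are arbitrary choices for illustration.

```r
fa <- factanal(factors = 2, covmat = ability.cov)

fa                                            # default display
print(fa, digits = 2, cutoff = 0.3)           # fewer digits, hide loadings below 0.3
print(loadings(fa), cutoff = 0, sort = TRUE)  # show every loading, sorted by factor
```

Suppression affects only what is printed; the stored loadings are unchanged.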
An object of class "factanal" with components
loadings 
A matrix of loadings, one column for each factor. The factors are ordered in decreasing order of sums of squares of loadings, and given the sign that will make the sum of the loadings positive. This is of class "loadings": see loadings for its print method.
uniquenesses 
The uniquenesses computed. 
correlation 
The correlation matrix used. 
criteria 
The results of the optimization: the value of the criterion (a linear function of the negative log-likelihood) and information on the iterations used. 
factors 
The argument factors.
dof 
The number of degrees of freedom of the factor analysis model. 
method 
The method: always "mle".
rotmat 
The rotation matrix if relevant. 
scores 
If requested, a matrix of scores. 
n.obs 
The number of observations if available, or NA.
call 
The matched call. 
na.action 
If relevant. 
STATISTIC, PVAL 
The significance-test statistic and P value, if it can be computed. 
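The components can be inspected directly, as in the sketch below, which again uses ability.cov for convenience. The closing degrees-of-freedom expression is the standard count for a model with p variables and k factors and is included here as a cross-check, not as a quotation from this page.

```r
fa <- factanal(factors = 2, covmat = ability.cov)

names(fa)          # components present in this fit
fa$uniquenesses
fa$criteria        # criterion value and iteration information
fa$dof             # degrees of freedom of the model
fa$n.obs
c(statistic = unname(fa$STATISTIC), p.value = unname(fa$PVAL))

# Cross-check of the degrees of freedom from p variables and k factors
p <- nrow(fa$correlation)
k <- fa$factors
((p - k)^2 - (p + k)) / 2
```

STATISTIC and PVAL are available here because ability.cov records the number of observations.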
There are so many variations on factor analysis that it is hard to compare output from different programs. Further, the optimization in maximum likelihood factor analysis is hard, and many of the other results we compared had poorer fits than those produced by this function. In particular, solutions which are ‘Heywood cases’ (with one or more uniquenesses essentially zero) are much more common than most texts and some other programs would lead one to believe.
Bartlett, M. S. (1937). The statistical conception of mental factors. British Journal of Psychology, 28, 97–104. doi:10.1111/j.2044-8295.1937.tb00863.x.
Bartlett, M. S. (1938). Methods of estimating mental factors. Nature, 141, 609–610. doi:10.1038/141246a0.
Jöreskog, K. G. (1963). Statistical Estimation in Factor Analysis. Almqvist and Wicksell.
Lawley, D. N. and Maxwell, A. E. (1971). Factor Analysis as a Statistical Method. Second edition. Butterworths.
Thomson, G. H. (1951). The Factorial Analysis of Human Ability. London University Press.
loadings (which explains some details of the print method), varimax, princomp, ability.cov, Harman23.cor, Harman74.cor.
Other rotation methods are available in various contributed packages, including GPArotation and psych.
# A little demonstration, v2 is just v1 with noise,
# and same for v4 vs. v3 and v6 vs. v5
# Last four cases are there to add noise
# and introduce a positive manifold (g factor)
v1 <- c(1,1,1,1,1,1,1,1,1,1,3,3,3,3,3,4,5,6)
v2 <- c(1,2,1,1,1,1,2,1,2,1,3,4,3,3,3,4,6,5)
v3 <- c(3,3,3,3,3,1,1,1,1,1,1,1,1,1,1,5,4,6)
v4 <- c(3,3,4,3,3,1,1,2,1,1,1,1,2,1,1,5,6,4)
v5 <- c(1,1,1,1,1,3,3,3,3,3,1,1,1,1,1,6,4,5)
v6 <- c(1,1,1,2,1,3,3,3,4,3,1,1,1,2,1,6,5,4)
m1 <- cbind(v1,v2,v3,v4,v5,v6)
cor(m1)
factanal(m1, factors = 3) # varimax is the default
factanal(m1, factors = 3, rotation = "promax")
# The following shows the g factor as PC1
prcomp(m1) # signs may depend on platform
## formula interface
factanal(~v1+v2+v3+v4+v5+v6, factors = 3,
         scores = "Bartlett")$scores
## a realistic example from Bartholomew (1987, pp. 61-65)
utils::example(ability.cov)