[R] KMO sampling adequacy and SPSS -- partial solution
Ashish Ranpura
buddhahead at ranpura.com
Thu Dec 8 00:08:41 CET 2005
Dear colleagues,
I've been searching for information on the Kaiser-Meyer-Olkin (KMO)
Measure of Sampling Adequacy (MSA). This statistic is generated in
SPSS and is often used to determine if a dataset is "appropriate" for
factor analysis -- its true utility seems quite low, but it seems to
come up in stats classes a lot. It did in mine, and a glance through
the R-help archives suggests I'm not alone.
I finally found a reference describing the calculation, and wrote the
following R function to perform it. Note that the function depends on
cor2pcor(), a partial correlation function from the corpcor package.
kmo.test <- function(df){
  ###
  ## Calculate the Kaiser-Meyer-Olkin Measure of Sampling Adequacy.
  ## Input should be a data frame or matrix, output is the KMO statistic.
  ## Formula derived from Hutcheson et al, 1999,
  ## "The multivariate social scientist," page 224, ISBN 0761952012
  ## see <http://www2.chass.ncsu.edu/garson/pa765/hutcheson.htm>
  ###
  library(corpcor)  # provides cor2pcor()
  r <- cor(df)
  ## Sum of squared correlations over the off-diagonal, each pair counted
  ## once: subtract the diagonal (all 1s) and halve.
  cor.sq <- r^2
  cor.sumsq <- (sum(cor.sq) - nrow(cor.sq)) / 2
  ## Same sum for the squared partial correlations.
  pcor.sq <- cor2pcor(r)^2
  pcor.sumsq <- (sum(pcor.sq) - nrow(pcor.sq)) / 2
  kmo <- cor.sumsq / (cor.sumsq + pcor.sumsq)
  return(kmo)
}
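For anyone without corpcor installed, the same ratio can be sketched in base R. This is a minimal sketch, not a replacement for the function above: it assumes the correlation matrix is invertible, and it obtains the partial correlations by rescaling the inverse correlation matrix and flipping the sign. The names pcor.from.cor and kmo.base are made up for this illustration, and the choice of mtcars columns is arbitrary.

```r
## Hypothetical helper: partial correlations from a correlation matrix r,
## via the rescaled inverse. solve() stops with an error if r is singular.
pcor.from.cor <- function(r) {
  p <- -cov2cor(solve(r))  # off-diagonals are the partial correlations
  diag(p) <- 1             # match the unit diagonal that cor2pcor() returns
  p
}

## Hypothetical base-R variant of the overall KMO statistic.
kmo.base <- function(df) {
  r <- cor(df)
  p <- pcor.from.cor(r)
  off <- row(r) != col(r)  # TRUE for off-diagonal cells
  r.ss <- sum(r[off]^2)    # every pair counted twice here, and likewise
  p.ss <- sum(p[off]^2)    # below, so the ratio is unaffected
  r.ss / (r.ss + p.ss)
}

## Example on a built-in dataset (column choice is arbitrary).
kmo.value <- kmo.base(mtcars[, c("mpg", "disp", "hp", "wt", "qsec")])
print(kmo.value)
```

The result should lie between 0 and 1, with values closer to 1 indicating that the partial correlations are small relative to the ordinary correlations.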
Also, for those trying to reproduce the SPSS factor analysis output,
(-1 * cor2pcor(cor(yourDataFrame))) will produce the off-diagonal
entries of the "anti-image correlation" matrix. Unfortunately, the most
useful property of that matrix in SPSS is that its diagonal holds the
individual MSA values -- I haven't found a way to derive those yet.
I'm still working on that; any suggestions are appreciated.
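One candidate for the individual MSA values, offered as an unverified sketch: apply the same ratio as the overall KMO, but restricted to one variable's row at a time, so that MSA_i = sum_j r_ij^2 / (sum_j r_ij^2 + sum_j p_ij^2) with j != i. This has not been checked against SPSS output here. The function name msa.per.variable is made up for this illustration, and the code assumes an invertible correlation matrix.

```r
## Hypothetical helper: one MSA value per variable (row-wise KMO ratio).
msa.per.variable <- function(df) {
  r <- cor(df)
  p <- -cov2cor(solve(r))  # partial correlations on the off-diagonal
  diag(r) <- 0             # exclude the diagonal from both sums
  diag(p) <- 0
  r.ss <- rowSums(r^2)
  p.ss <- rowSums(p^2)
  r.ss / (r.ss + p.ss)     # named vector, one entry per column of df
}

## Example on a built-in dataset (column choice is arbitrary).
msa.values <- msa.per.variable(mtcars[, c("mpg", "disp", "hp", "wt", "qsec")])
print(msa.values)
```

If this matches SPSS, writing these values onto the diagonal of the negated partial correlation matrix would give the full anti-image correlation table.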
--Ash.
-----
Ashish Ranpura
Institute of Cognitive Neuroscience
University College London
17 Queen Square
London WC1N 3AR
tel: +44 (20) 7679 1126
web: http://www.icn.ucl.ac.uk