[R] Combining imputed datasets for analysis using Factor Analysis
Conrad Zygmont
zygmontc at hbc.ac.za
Mon Aug 20 16:19:36 CEST 2012
Dear R users and developers,
I have a dataset containing 34 variables measured in a survey, which has
some missing items. I would like to conduct a factor analysis of these
data. I tested mi, Amelia, and missForest as alternative packages for
imputing the missing data. I now have 5 separate imputed datasets
containing the variables I am interested in factor analysing. In my
reading of the package help files, various articles and books, I have
come across a number of suggestions for combining analyses (mostly
regression or other linear models) using Rubin's (1987) rules.
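For a single scalar estimate (say one regression coefficient), the
pooling those sources describe can be sketched roughly as below. This is
only a minimal illustration: the function name pool_rubin and the
estimates and standard errors passed to it are made up for the example
and do not come from the survey data.

```r
# Pool one scalar estimate across m imputed datasets (Rubin, 1987).
# `est` holds the m point estimates, `se` their standard errors.
pool_rubin <- function(est, se) {
  m     <- length(est)
  q_bar <- mean(est)               # pooled point estimate
  w     <- mean(se^2)              # within-imputation variance
  b     <- var(est)                # between-imputation variance
  t_var <- w + (1 + 1 / m) * b     # total variance
  list(estimate = q_bar, se = sqrt(t_var))
}

# Hypothetical estimates from m = 5 imputed datasets (illustration only)
est <- c(0.52, 0.48, 0.50, 0.55, 0.45)
se  <- c(0.10, 0.11, 0.10, 0.12, 0.10)
pooled <- pool_rubin(est, se)
print(pooled)
```

The rules apply directly to quantities like coefficients, which have a
point estimate and a standard error in every imputed dataset.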
However, I am not sure how I should proceed in the case of factor
analysis. Should I calculate the covariance or correlation matrix for
each imputed dataset, combine these estimates, and then perform a single
factor analysis? Or should I conduct a FA of each complete imputed
dataset and then combine the results (say eigenvalues or fit
statistics)? Could anyone point me to literature (if possible, not
overly technical) that would guide me in this regard, or provide an
example of a script that would help me achieve this?
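To make the two options concrete, here is a rough sketch of what each
might look like in base R. Everything in it is an assumption made for
illustration: the list `imputed` is simulated stand-in data (six
continuous variables, one latent factor), not the LRN items, and plain
Pearson correlations from cor() are used where ordinal items would
presumably call for polychoric correlations. It is not meant to suggest
that either option is the correct one.

```r
set.seed(1)
n <- 200
f <- rnorm(n)                        # hypothetical latent factor

# Stand-in for one completed (imputed) dataset: 6 noisy indicators of f
make_data <- function() {
  x <- sapply(1:6, function(j) 0.7 * f + rnorm(n, sd = 0.7))
  colnames(x) <- paste0("V", 1:6)
  as.data.frame(x)
}
imputed <- replicate(5, make_data(), simplify = FALSE)

# Option 1: one correlation matrix per dataset, averaged, then a single FA
cors       <- lapply(imputed, cor)
pooled_cor <- Reduce(`+`, cors) / length(cors)
fa_pooled  <- factanal(covmat = pooled_cor, factors = 1, n.obs = n)

# Option 2: analyse each dataset separately, then summarise the results
# (here: eigenvalues of each correlation matrix, averaged across datasets)
eigs      <- sapply(cors, function(r) eigen(r, only.values = TRUE)$values)
mean_eigs <- rowMeans(eigs)

print(loadings(fa_pooled))
print(round(mean_eigs, 3))
```

Whether a simple average of the matrices or of the eigenvalues is
statistically defensible is exactly what I am unsure about.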
Your assistance and time are much appreciated.
Kind Regards,
Conrad Zygmont
Psychology Department
Helderberg College
South Africa
Additional info:
R version 2.15.1 (2012-06-22) -- "Roasted Marshmallows"
Running on Linux version 3.3.8-gentoo (root at PsychStat) (gcc version
4.5.3 (Gentoo 4.5.3-r2 p1.5, pie-0.4.7) )
Script for multiple imputation:
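> library(mi)     # mi.info, mi.preprocess, mi, mi.completed, mi.data.frame
> library(psych)  # polychoric
> var.info <- mi.info(LRN)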
> var.info
> # All 34 items (LRN1 to LRN34) are ordered-categorical
> item.types <- as.list(setNames(rep("ordered-categorical", 34),
paste0("LRN", 1:34)))
> var.info <- update(var.info, "type", item.types)
> prepared.data <- mi.preprocess(LRN, info = var.info)
> ImpLRN <- mi(prepared.data, n.imp = 5, n.iter = 50,
check.coef.convergence = TRUE, add.noise = noise.control(post.run.iter =
30))
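> # All completed (imputed) datasets
> LRN.imputed <- mi.completed(ImpLRN)
> # The first completed dataset only
> LRN.first <- mi.data.frame(ImpLRN, m=1)
> # Polychoric correlations for the first completed dataset
> cov.mat <- polychoric(LRN.first,std.err=TRUE)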
... and so on