[R] Maintaining data order in factanal with missing data

Justin Delahunty ACU at genius.net.au
Fri Jul 26 14:22:16 CEST 2013


Hi Petr,

Thanks for the quick response. Unfortunately I cannot share the data I am
working with; however, please find attached a suitable R workspace with
generated data. It has the appropriate variable names; only the data have
been changed.

I call the last function in the list, init.dfs(), to subset the overall data
set into the three waves and then run the factor analysis on each (1 factor
CFA); it's only wrapped in a function to save re-typing in a new workspace.
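
For illustration only, here is a minimal sketch of how per-wave scores could be collected against the ID, so that cases dropped by complete.cases() come back as NA rather than shifting the order. The data are simulated and the names demo, wave_cols and the w1_/w2_ item columns are hypothetical; only dmid, factanal() and complete.cases() come from the functions discussed in this thread.

```r
# Simulated two-wave data; every column name except dmid is made up.
set.seed(42)
n <- 150
make_wave <- function(prefix) {
  latent <- rnorm(n)
  loads <- c(0.8, 0.7, 0.6, 0.75)
  items <- sapply(loads, function(l) l * latent + rnorm(n, sd = 0.6))
  colnames(items) <- paste0(prefix, "_item", 1:4)
  as.data.frame(items)
}
demo <- cbind(dmid = seq_len(n), make_wave("w1"), make_wave("w2"))

# Introduce some missing values so that a few rows are incomplete.
demo$w1_item2[c(5, 40)] <- NA
demo$w2_item3[c(5, 90, 91)] <- NA

wave_cols <- list(w1 = paste0("w1_item", 1:4),
                  w2 = paste0("w2_item", 1:4))

# Start from the full list of IDs and left-join each wave's scores onto it.
result <- demo["dmid"]
for (wave in names(wave_cols)) {
  cols <- wave_cols[[wave]]
  ok <- complete.cases(demo[cols])
  fit <- factanal(demo[ok, cols], factors = 1, scores = "regression")
  # Pair each score with the ID of the row it was computed from.
  wave_scores <- data.frame(dmid = demo$dmid[ok], score = fit$scores[, 1])
  names(wave_scores)[2] <- paste0(wave, "_score")
  # all.x = TRUE keeps every ID; IDs without a score get NA.
  result <- merge(result, wave_scores, by = "dmid", all.x = TRUE)
}

head(result)
```

Joining on dmid rather than relying on row position means the result does not depend on which rows were skipped in each wave. The rescaling step from re.scale() is left out to keep the sketch short.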


Thanks,

Justin

-----Original Message-----
From: PIKAL Petr [mailto:petr.pikal at precheza.cz] 
Sent: Friday, 26 July 2013 7:35 PM
To: Justin Delahunty; r-help at r-project.org
Subject: RE: [R] Maintaining data order in factanal with missing data

Hi

You provided the functions, so far so good. But without data it is quite
difficult to understand what the functions do and where the issue could be.

I suspect a combination of complete-case selection with subset and
factor behaviour. But I could be completely off target too.
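
To make that suspicion concrete, here is a small sketch on made-up data (toy and v1 to v5 are invented names, not the poster's variables). It shows that the scores returned by factanal() cover only the complete cases, and one possible way to put them back into the original row positions.

```r
# Made-up data: five indicators driven by one latent factor.
set.seed(1)
n <- 200
latent <- rnorm(n)
toy <- data.frame(
  v1 = 0.8 * latent + rnorm(n, sd = 0.6),
  v2 = 0.7 * latent + rnorm(n, sd = 0.7),
  v3 = 0.6 * latent + rnorm(n, sd = 0.8),
  v4 = 0.7 * latent + rnorm(n, sd = 0.7),
  v5 = 0.5 * latent + rnorm(n, sd = 0.9)
)

# Make rows 3, 50, 77 and 120 incomplete.
toy$v2[c(3, 50, 120)] <- NA
toy$v4[c(3, 77)] <- NA

ok <- complete.cases(toy)
fit <- factanal(toy[ok, ], factors = 1, scores = "regression")

# The score matrix has one row per complete case, so it is shorter
# than the data and its positions no longer match the original rows.
c(rows_in_data = nrow(toy), rows_scored = nrow(fit$scores))

# Pad the scores back to full length: NA where the case was skipped.
scores <- rep(NA_real_, nrow(toy))
scores[ok] <- fit$scores[, 1]
toy$score <- scores
```

factanal() also has an na.action argument that applies when x is given as a formula; whether na.exclude would suit this workflow has not been checked here.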

Petr

> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of s00123776 at myacu.edu.au
> Sent: Friday, July 26, 2013 9:35 AM
> To: r-help at r-project.org
> Subject: [R] Maintaining data order in factanal with missing data
> 
> Hi,
> 
> 
> 
> I'm new to R, so sorry if this has a simple answer. I'm currently
> trying to collapse some ordinal variables into a composite; the
> program should ideally take a data frame as input, perform a factor
> analysis, compute factor scores, SDs, etc., and return the rescaled
> scores and loadings. The difficulty I'm having is that my data set
> contains a number of NAs, which I am excluding from the analysis using
> complete.cases(), and thus the incomplete cases are "skipped". These
> functions are for a longitudinal data set with repeated waves of data,
> so the final rescaled scores from each wave need to be saved as
> variables grouped by a unique ID (DMID). The functions I'm trying to
> implement are as follows:
> 
> 
> 
> # Weighted standard deviation of x with weights w
> weighted.sd <- function(x, w) {
>   sum.w  <- sum(w)
>   sum.w2 <- sum(w^2)
>   mean.w <- sum(x * w) / sum.w
>   x.sd.w <- sqrt((sum.w / (sum.w^2 - sum.w2)) * sum(w * (x - mean.w)^2))
>   return(x.sd.w)
> }
> 
> 
> 
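> re.scale <- function(f.scores, raw.data, loadings) {
>   # Standardise the factor scores (subtract the mean, divide by the SD)
>   fz.scores <- (f.scores - mean(f.scores)) / sd(f.scores)
>   # Loading-weighted mean and SD of the raw items for each row (case)
>   means <- apply(raw.data, 1, weighted.mean, w = loadings)
>   sds   <- apply(raw.data, 1, weighted.sd, w = loadings)
>   grand.mean <- mean(means)
>   grand.sd   <- mean(sds)
>   # Put the standardised scores back on the scale of the raw items
>   final.scores <- (fz.scores * grand.sd) + grand.mean
>   return(final.scores)
> }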
> 
> 
> 
> get.scores <- function(data) {
>   # Keep only rows with no missing values; incomplete rows are dropped here
>   complete <- data[complete.cases(data), ]
>   fact <- factanal(complete, factors = 1, scores = "regression")
>   f.scores <- fact$scores[, 1]
>   f.loads  <- fact$loadings[, 1]
>   rescaled.scores <- re.scale(f.scores, complete, f.loads)
>   output.list <- list(rescaled.scores = rescaled.scores,
>                       factor.loadings = f.loads)
>   return(output.list)
> }
> 
> 
> 
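> init.dfs <- function() {
>   # Split the full data set (ab.df) into one data frame per wave
>   ab.1.df <- subset(ab.df, select = c(dmid, g5oab2:g5ovb1))
>   ab.2.df <- subset(ab.df, select = c(dmid, w2oab3:w2ovb1))
>   ab.3.df <- subset(ab.df, select = c(dmid, w3oab3, w3oab4, w3oab7,
>                                       w3oab8, w3ovb1))
>   # Drop the ID column (dmid) and run the factor analysis on each wave
>   ab.1.fa <- get.scores(ab.1.df[-1])
>   ab.2.fa <- get.scores(ab.2.df[-1])
>   ab.3.fa <- get.scores(ab.3.df[-1])
>   # Return all three results; as local variables they would otherwise
>   # be discarded when the function exits
>   list(wave1 = ab.1.fa, wave2 = ab.2.fa, wave3 = ab.3.fa)
> }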
> 
> 
> 
> Thanks for your help,
> 
> 
> 
> Justin