[R] Averaging over data sets

Joshua Wiley jwiley.psych at gmail.com
Fri Jan 13 07:52:48 CET 2012


Hi,

I would write a small function that does different things depending
on the class of the variable, along these lines (where i is a column
index):

yourfun <- function(i) {
  if (is.numeric(imputeddata[, i])) {
    # something for numeric columns
  } else if (is.factor(imputeddata[, i])) {
    # something else for factors
  }
  # etc. for any other classes
}

then you can just do:

combined <- lapply(seq_len(ncol(imputeddata)), yourfun)
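
To make that concrete, here is a rough sketch using the two example data
frames from the question below. The names imputed and combine_column are made
up for illustration, and it assumes the non-numeric columns were not imputed,
so the first copy can be kept as-is. It is one possible way to fill in the
"something" parts, not the only one.

```r
# Example data from the question. stringsAsFactors = TRUE makes the character
# columns factors (this was the default before R 4.0.0).
d1 <- data.frame(subject = c("Felipe", "John"), eat1 = 1:2, eat3 = 5:6,
                 trt = c("t1", "t2"), stringsAsFactors = TRUE)
d2 <- data.frame(subject = c("Felipe", "John"), eat1 = 3:4, eat3 = 6:7,
                 trt = c("t1", "t2"), stringsAsFactors = TRUE)

# Hypothetical name: a list holding all imputed data sets
imputed <- list(d1, d2)

# Combine column i across all imputed data sets
combine_column <- function(i) {
  cols <- lapply(imputed, function(d) d[[i]])
  if (is.numeric(cols[[1]])) {
    # element-wise mean across the imputations
    Reduce(`+`, cols) / length(cols)
  } else {
    # assumption: non-numeric columns are identical in every data set,
    # so keep the first copy unchanged
    cols[[1]]
  }
}

combined <- lapply(seq_along(imputed[[1]]), combine_column)
names(combined) <- names(imputed[[1]])
combined <- as.data.frame(combined)
combined
```

If a factor was itself imputed, keeping the first copy is not appropriate;
you would need some other rule for that branch (for example, the most
frequent level across imputations).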

Alternatively, you could consider a single imputation approach, since
averaging the imputed data sets into one is essentially what you end up
doing anyway.
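
As a very crude illustration of single imputation in base R, the sketch
below fills each missing value once: the column mean for numeric variables
and the most frequent level for factors. The data frame raw and the function
fill_once are invented for this example; this is not what Amelia does, and
more careful single imputation methods exist.

```r
# Hypothetical data with missing values (not from the original question)
raw <- data.frame(eat1 = c(1, NA, 3),
                  trt  = factor(c("t1", NA, "t1")))

# Fill the missing values of one column a single time
fill_once <- function(x) {
  if (is.numeric(x)) {
    # replace NA with the mean of the observed values
    x[is.na(x)] <- mean(x, na.rm = TRUE)
  } else if (is.factor(x)) {
    # replace NA with the most frequent observed level
    x[is.na(x)] <- names(which.max(table(x)))
  }
  x
}

single <- raw
single[] <- lapply(raw, fill_once)  # keeps the data frame structure
single
```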

Cheers,

Josh

On Thu, Jan 12, 2012 at 10:16 PM, Felipe Nunes <felipnunes at gmail.com> wrote:
> Hi all,
>
> after using Amelia II to create 10 imputed data sets, I need to average them
> into a single data set that holds the average of each cell for the imputed
> variables, along with the values of the variables that were not imputed.
> The data have many variables (some numeric, others factors) and more than
> 20000 observations. I do not know how to average them. Any help?
>
> Below I provide a small example:
>
> Suppose Amelia provided two datasets:
>
> d1 <- data.frame(subject = c("Felipe", "John"), eat1 = 1:2, eat3 = 5:6,
>                  trt = c("t1", "t2"))
>
> d2 <- data.frame(subject = c("Felipe", "John"), eat1 = 3:4, eat3 = 6:7,
>                  trt = c("t1", "t2"))
>
> I tried
>
> (d1 + d2)/2
>
> but I lose my factors. mean() did not work either.
>
> The result I'd like is:
>
>   subject eat1 eat3 trt
> 1  Felipe    2  5.5  t1
> 2    John    3  6.5  t2
>
> thanks,
>
> *Felipe Nunes*
> CAPES/Fulbright Fellow
> PhD Student Political Science - UCLA
> Web: felipenunes.bol.ucla.edu
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/


