[R] summarizing a complex dataframe

David Winsemius dwinsemius at comcast.net
Wed Jan 11 23:12:07 CET 2012


On Jan 11, 2012, at 3:55 PM, Christopher G Oakley wrote:

> I need some help summarizing complex data frames (small example  
> below):
>
>    m1_1 m2_1 m3_1 m1_2 m2_2 m3_2
> i1    1    1    1    2    2    2
> i1    2    1    1    2    2    2
> i2    2    2    1    2    2    2
>
>
> For an arbitrary number of columns (say m1 … m199) where the column  
> names have variable patterns,
>
> and such that each set of columns is repeated (with potentially  
> unique data) an arbitrary number of times (say _1 … _1000),
>
> I would like to summarize by row the mean values of (m1, m2, m3, …  
> m199) over all replicates (_1, _2, _3, … _1000). I need to do this  
> with a large number of dataframes of variable nrow, ncolumn, and  
> colnames.

Something along the lines of this untested code:

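# Anchor the pattern: a bare "m1" would also match m10_1, m11_1, ..., m199_1
sapply(unique(sub("_.+$", "", names(dfrm))),
           function(x)  rowMeans( dfrm[ , grep(paste0("^", x, "_"), names(dfrm)), drop = FALSE ] )
         )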

Post a reproducible example and we can test it.
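
In the meantime, here is a minimal sketch of the same idea run on the small example from the question. Several things in it are assumptions on my part rather than anything the poster confirmed: the data frame is a hand-typed reconstruction, the i1/i1/i2 labels are kept in an ordinary column called id (row names must be unique, so they cannot be row names), and replicate suffixes are assumed to always look like an underscore followed by digits. It groups columns by comparing the stripped names directly instead of calling grep() a second time, which sidesteps partial matches such as "m1" against "m10_1".

```r
# Hand-typed reconstruction of the posted example (assumed layout).
dfrm <- data.frame(
  id   = c("i1", "i1", "i2"),
  m1_1 = c(1, 2, 2), m2_1 = c(1, 1, 2), m3_1 = c(1, 1, 1),
  m1_2 = c(2, 2, 2), m2_2 = c(2, 2, 2), m3_2 = c(2, 2, 2)
)

# Keep only the columns that end in _<digits>; this drops the id column.
meas <- dfrm[, grep("_[0-9]+$", names(dfrm)), drop = FALSE]

# Strip the replicate suffix to get one stem per column: m1, m2, m3, m1, ...
stems <- sub("_[0-9]+$", "", names(meas))

# For each distinct stem, average its replicate columns row by row.
# drop = FALSE keeps a one-column selection as a data frame for rowMeans().
res <- sapply(unique(stems), function(s) {
  rowMeans(meas[, stems == s, drop = FALSE])
})

res
```

The result is a numeric matrix with one row per row of dfrm and one column per stem. To run it over many data frames, the last three steps could be wrapped in a function and applied with lapply() to a list of them.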

>
> I've tried various loops creating new dataframes and reassigning  
> cell values in loops or using rbind and cbind, but run into trouble  
> in each case.
>
> Any ideas?
>
> Thanks,
>
> Chris
>

David Winsemius, MD
West Hartford, CT


