[R] Better way of Grouping?

Sat Sep 29 00:09:28 CEST 2012

On Sep 28, 2012, at 11:59 AM, Charles Determan Jr wrote:

> Hello R users,
> 
> This is more of a convenience question that I hope others might find useful
> if there is a better answer.  I work with large datasets that require
> multiple parsing stages for different analyses.  For example, compare group
> 3 vs. group 4.  A more complicated comparison would be time B in group 3 of
> group L with B in group 4 of group L.  I normally subset each group with
> the following type of code.
> 
> data=read(...)
> 
> #L v D
> L=data[LvD %in% c("L"),]
> D=data[LvD %in% c("D"),]
> 
> #Groups 3 and 4 within L and D
> group3L=L[group %in% c("3"),]
> group4L=L[group %in% c("3"),]

Assume you meant to have a "4" there
> 
> group3D=D[group %in% c("3"),]
> group4D=D[group %in% c("3"),]

Ditto. Only makes sense with a "4".

The usual way is to use:

lapply( split(data, interaction(data$LvD, data$group)),
        function(subdf) { <do something with subdf> } )

That way you do not end up littering your workspace with subsidiary subsets of your main data object.
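
A minimal sketch of that pattern, assuming a toy data frame in place of the real one. The column names LvD and group come from the question; the value column, the data themselves, and the choice of mean() as the per-group computation are made up for illustration:

```r
# Hypothetical toy data standing in for the real dataset.
data <- data.frame(
  LvD   = rep(c("L", "D"), each = 4),
  group = rep(c("3", "4"), times = 4),
  value = c(1, 2, 3, 4, 5, 6, 7, 8)
)

# One data frame per LvD/group combination, held in a single named list.
# interaction() builds names of the form "L.3"; drop = TRUE discards
# combinations that do not occur in the data.
pieces <- split(data, interaction(data$LvD, data$group, drop = TRUE))

# Apply the same computation to every piece (here, the mean of 'value').
means <- sapply(pieces, function(subdf) mean(subdf$value))

# An individual subset can still be pulled out by name when needed.
group3L <- pieces[["L.3"]]
```

Adding a further factor such as time to the interaction() call would split on it as well, so the "time B in group 3 of L" comparison should not need its own hand-written subset.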

> 
> #Times B, S45, FR2, FR8
> you get the idea
> 
> 
> Is there a more efficient way to subset groups?  Thanks for any insight.
> 
-- 

David Winsemius, MD
Alameda, CA, USA