[R] scale subsets of grouped data in data frame

Steve Lianoglou mailinglist.honeypot at gmail.com
Sat Aug 1 03:38:52 CEST 2009


On Jul 31, 2009, at 7:17 PM, Noah Silverman wrote:

> Hello,
> I'm trying to duplicate what's an easy process in RapidMiner.
> In RM, we can simply use two operators:
>     subgroup iteration
>     attribute value selection (can use a regex for the attribute name)
> I can do this in R with a lot of code and manual steps.  It would be
> really nice to find a more automated way.
> My data looks like this:
> group  group_height  group_weight  height  weight
> g22    3.2           8.896         3.2     8.896
> g22    2.5           6.95          2.5     6.95
> g22    3.1           8.618         3.1     8.618
> g49    2.4           6.672         2.4     6.672
> g49    4.2           11.676        4.2     11.676
> g49    2.5           6.95          2.5     6.95
> g55    2.6           7.228         2.6     7.228
> g55    3.4           9.452         3.4     9.452
> g55    3.3           9.174         3.3     9.174
> What I want to do is scale the data within each group.
> So, in pseudo-code:
>     for(group in groups){
>         if(column_name = regex(group_.*)){
>             data[column_name] = scale(data[group,column_name])
>         }
>     }
> This way I get "group wise" normalization of my data, but still have the
> original values, which I will normalize "database wide" for some
> comparisons.
> Can anybody help solve this one?
> -N

You can do this quite easily.

Just take what you learned from the last example re: scaling subsets,
and play around with the functions documented on the ?grep help page
(grep, grepl, and friends). You'll be running those functions against
the strings you get back from colnames(data).
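
For the archives, here is a minimal sketch of how those two pieces might fit together. It rebuilds the example data from the question, so the data frame name `data` and the `^group_` pattern are taken from there. Using ave() for the per-group step is an assumption on my part; the earlier example isn't reproduced in this message, and tapply() or split() would work just as well.

```r
# Rebuild the example data from the question.
data <- data.frame(
  group        = rep(c("g22", "g49", "g55"), each = 3),
  group_height = c(3.2, 2.5, 3.1, 2.4, 4.2, 2.5, 2.6, 3.4, 3.3),
  group_weight = c(8.896, 6.95, 8.618, 6.672, 11.676, 6.95,
                   7.228, 9.452, 9.174),
  stringsAsFactors = FALSE
)
# The unprefixed columns start out as copies and are left untouched.
data$height <- data$group_height
data$weight <- data$group_weight

# Select the columns whose names start with "group_".
group_cols <- grep("^group_", colnames(data), value = TRUE)

# Scale each selected column within each group.
# ave() applies FUN per group and returns values in the original row order;
# as.numeric() drops the one-column matrix that scale() returns.
for (col in group_cols) {
  data[[col]] <- ave(data[[col]], data$group,
                     FUN = function(x) as.numeric(scale(x)))
}
```

After the loop, group_height and group_weight are centered and scaled within each group, while height and weight still hold the original values for the "database wide" normalization.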


Steve Lianoglou
Graduate Student: Computational Systems Biology
   |  Memorial Sloan-Kettering Cancer Center
   |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
