[R] scale subsets of grouped data in data frame
Steve Lianoglou
mailinglist.honeypot at gmail.com
Sat Aug 1 03:38:52 CEST 2009
Hi,
On Jul 31, 2009, at 7:17 PM, Noah Silverman wrote:
> Hello,
>
> I'm trying to duplicate what's an easy process in RapidMiner.
>
> In RM, we can simply use two operators:
> subgroup iteration
> attribute value selection (can use a regex for the attribute name)
>
> I can do this in R with a lot of code and manual steps. It would be
> really nice to find a more automated way.
>
> My data looks like this:
>
> group  group_height  group_weight  height  weight
> g22    3.2           8.896         3.2     8.896
> g22    2.5           6.95          2.5     6.95
> g22    3.1           8.618         3.1     8.618
> g49    2.4           6.672         2.4     6.672
> g49    4.2           11.676        4.2     11.676
> g49    2.5           6.95          2.5     6.95
> g55    2.6           7.228         2.6     7.228
> g55    3.4           9.452         3.4     9.452
> g55    3.3           9.174         3.3     9.174
>
> What I want to do is scale the data within each group.
> So, in pseudo-code:
> for(group in groups){
>   if(column_name = regex(group_.*)){
>     data[column_name] = scale(data[group,column_name])
>   }
> }
>
> This way I get "group wise" normalization of my data, but still have the
> original values, which I will normalize "database wide" for some
> comparisons.
>
> Can anybody help solve this one?
>
> -N
You can do this quite easily.
Just take what you learned from the last example re: scaling subsets,
and play around with some of the functions you see in the ?grep help
page. You'll be using those functions against the strings you get back
from colnames(data).
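
For instance, here is a minimal sketch of one way to put those pieces together. It assumes the data frame is called data and is laid out as in your example. Using ave() for the per-group step is just one option and not necessarily what the earlier example did; the name group_cols is made up for illustration.

```r
# Example data copied from the question
data <- data.frame(
  group        = rep(c("g22", "g49", "g55"), each = 3),
  group_height = c(3.2, 2.5, 3.1, 2.4, 4.2, 2.5, 2.6, 3.4, 3.3),
  group_weight = c(8.896, 6.95, 8.618, 6.672, 11.676, 6.95, 7.228, 9.452, 9.174),
  height       = c(3.2, 2.5, 3.1, 2.4, 4.2, 2.5, 2.6, 3.4, 3.3),
  weight       = c(8.896, 6.95, 8.618, 6.672, 11.676, 6.95, 7.228, 9.452, 9.174)
)

# Select the columns whose names start with "group_"
group_cols <- grep("^group_", colnames(data), value = TRUE)

# Scale each selected column within each level of data$group.
# ave() applies FUN per group and returns the results in the original row order;
# as.numeric() drops the matrix attributes that scale() adds.
for (col in group_cols) {
  data[[col]] <- ave(data[[col]], data$group,
                     FUN = function(x) as.numeric(scale(x)))
}
```

The height and weight columns are left untouched, so they can still be scaled across the whole data set afterwards, e.g. with scale(data[c("height", "weight")]).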
-steve
--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact