[R] Sapply

hadley wickham h.wickham at gmail.com
Mon Aug 31 01:51:57 CEST 2009


On Sun, Aug 30, 2009 at 5:08 PM, Noah Silverman<noah at smartmediacorp.com> wrote:
> Hi,
>
> I need a bit of guidance with the sapply function.  I've read the help page,
> but am still a bit unsure how to use it.
>
> I have a large data frame with about 100 columns and 30,000 rows.  One of
> the columns is "group" of which there are about 2,000 distinct "groups".
>
> I want to normalize (sum to 1) one of my variables per-group.
>
> Normally, I would just write a huge "for each" loop, but have read that is
> hugely inefficient with R.
>
> The old way would be (just an example, syntax might not be perfect):
>
> for (group in unique(data$group)) {
>    # rows belonging to the current group
>    idx <- data$group == group
>    # rescale so the scores in this group sum to 1
>    data$score[idx] <- data$score[idx] / sum(data$score[idx])
> }

It might be easier to use ddply from the plyr package.  The command
you want would be:

library(plyr)
data <- ddply(data, "group", transform, score = score / sum(score))

More information at http://had.co.nz/plyr.
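
If installing plyr is not an option, the same per-group normalization can probably be done in base R with ave(). The following is only a sketch: the toy data frame and its values are made up for illustration, and only the column names "group" and "score" come from the question above.

```r
# Hypothetical toy data standing in for the real 30,000-row data frame.
data <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  score = c(1, 3, 2, 2, 4)
)

# ave() applies FUN to the scores within each group and returns a
# vector in the original row order, so it can be assigned straight back.
data$score <- ave(data$score, data$group, FUN = function(x) x / sum(x))

# Check: the scores in every group should now sum to 1.
tapply(data$score, data$group, sum)
```

Unlike ddply, which returns the rows sorted by group, ave() leaves the row order of the data frame unchanged.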

Hadley

-- 
http://had.co.nz/



