[R] Sapply
hadley wickham
h.wickham at gmail.com
Mon Aug 31 01:51:57 CEST 2009
On Sun, Aug 30, 2009 at 5:08 PM, Noah Silverman <noah at smartmediacorp.com> wrote:
> Hi,
>
> I need a bit of guidance with the sapply function. I've read the help page,
> but am still a bit unsure how to use it.
>
> I have a large data frame with about 100 columns and 30,000 rows. One of
> the columns is "group" of which there are about 2,000 distinct "groups".
>
> I want to normalize (sum to 1) one of my variables per-group.
>
> Normally, I would just write a huge "for each" loop, but have read that is
> hugely inefficient with R.
>
> The old way would be (just an example, syntax might not be perfect):
>
> for (group in unique(data$group)) {
>   # rows belonging to the current group
>   idx <- data$group == group
>   # divide each score by its group total so the group sums to 1
>   data$score[idx] <- data$score[idx] / sum(data$score[idx])
> }
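
It might be easier to use ddply from the plyr package, which splits the
data frame by group, applies a function to each piece, and combines the
results. The command you want would be:

  library(plyr)
  data <- ddply(data, "group", transform, score = score / sum(score))
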
More information at http://had.co.nz/plyr.
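
For comparison, the same per-group normalization can be sketched in base R with ave(), which applies a function within each group and returns a vector aligned with the original rows. This is only an illustration on made-up toy data; the column name score_norm and the values are assumptions, not part of the original question.

```r
# Toy data (hypothetical values); "group" and "score" follow the thread's naming
data <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  score = c(1, 3, 2, 2, 4)
)

# ave() evaluates FUN separately on the scores of each group and
# puts the results back in the original row order
data$score_norm <- ave(data$score, data$group, FUN = function(x) x / sum(x))

# Sanity check: the normalized scores should sum to 1 within every group
group_totals <- tapply(data$score_norm, data$group, sum)
print(group_totals)
```

Writing to a new column (score_norm, an illustrative name) keeps the raw scores around for checking; assigning back to data$score instead would match the ddply call above.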
Hadley
--
http://had.co.nz/