[R] Avoiding for loops

Peter Dalgaard P.Dalgaard at biostat.ku.dk
Mon Nov 2 12:30:26 CET 2009


Dimitris Rizopoulos wrote:
> you could try something along these lines:
> 
> data <- data.frame(y = rnorm(100), group = rep(1:10, each = 10))
> 
> data$sum <- ave(data$y, data$group, FUN = sum)
> data$norm.y <- data$y / data$sum
> data

.. or even

transform(data, norm=ave(y, group, FUN = function(x) x/sum(x)))
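Applied to the column names in the original question (V1, group), the same idea might look roughly like the sketch below. The data here is made up purely for illustration, and the final check simply assumes that "normalize per group" means dividing each value by its group's sum, as the loop in the question does.

```r
# Illustrative data only: the column names V1, group and subject follow
# the original question, but the values are invented.
set.seed(1)
data <- data.frame(
  group   = rep(1:5, each = 4),
  subject = rep(1:4, times = 5),
  V1      = runif(20, min = 1, max = 10)
)

# ave() applies FUN within each level of 'group' and returns a vector
# aligned with the original rows, so no explicit loop is needed.
data$V1_norm <- ave(data$V1, data$group, FUN = function(x) x / sum(x))

# Sanity check: each group's normalized values should sum to 1.
group_totals <- tapply(data$V1_norm, data$group, sum)
stopifnot(isTRUE(all.equal(as.vector(group_totals), rep(1, 5))))
```

Because ave() returns its result in the original row order, the subject column never has to be consulted; the inner loop over subjects in the question only re-selects rows that the group-level operation already covers.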

> 
> I hope it helps.
> 
> Best,
> Dimitris
> 
> 
> Noah Silverman wrote:
>> Hi,
>>
>> I'm trying to normalize some data.
>> My data is organized by groups.  I want to normalize PER GROUP as
>> opposed to over the entire data set.
>>
>> The current double loop that I'm using takes almost an hour to run on
>> about 30,000 rows of data in 2,500 groups.
>>
>> I'm currently doing this:
>>
>> -------------------------------------
>> for(group in unique(data$group)){
>>     sum_V1 <- sum(data$V1[data$group == group])
>>
>>     for(subject in data$subject[data$group == group]){
>>         data$V1_norm[(data$group == group & data$subject == subject)] <-
>>             data$V1[(data$group == group & data$subject == subject)] / sum_V1
>>     }
>> }
>> -------------------------------------
>>
>> Can anyone point me to a faster way to do this in R?
>>
>> Thanks!
>>
>> -N
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
> 


-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907



