[R] Conditional sum

Dieter Menne dieter.menne at menne-biomed.de
Sun Feb 20 18:11:05 CET 2011



mathijsdevaan wrote:
> 
> I have a DF like this:
> 
> DF = data.frame(read.table(textConnection("    A  B  C 
> 1 b1  1999  0.25
> 2 c1  1999  0.25
> ..
> For each factor in A I want to sum the values of C for all years(Bn) prior
> to the current year(Bi):
> 
> 1 b1  1999  0.25  0
> 2 c1  1999  0.25  0.4
> 3 d1  1999  0.25  0
> 
> 
In steps following the "thinking order". You could shorten this
considerably. I slightly changed you column names to more speakable ones.

Dieter


DF = data.frame(read.table(textConnection("    group  year  C
1 b1  1999  0.25
2 c1  1999  0.25
3 d1  1999  0.25
4 a2  1999  0.25
5 c2  1999  0.25
6 d2  1999  0.25
7 a3  1999  0.25
8 b3  1999  0.25
9 d3  1999  0.25
10 a4  1999  0.25
11 b4  1999  0.25
12 c4  1999  0.25
13 b1  2001  0.5
14 a2  2001  0.5
15 b1  2004  0.33
16 c1  2004  0.33
17 a2  2004  0.33
18 c2  2004  0.33
19 a3  2004  0.33
20 b3  2004  0.33
21 d2  1980  0.4
22 a3  1980  0.4
23 b4  1981  0.4
24 c1  1981  0.4"),head=TRUE))

by(DF,DF$group, FUN = function(x){
  print(str(x))
})
# Looks like we should order...
# Other solutions are possible, but ordering all first might (not tested)
# be the most efficient way for large sets
DF = DF[order(DF$group,DF$year),]
# Let's try cumsum on each group
by(DF,DF$group, FUN = function(x){
  cumsum(x$C)
})
# That's not exactly your defininition of "prior"
# correct for first value
by(DF,DF$group, FUN = function(x){
  cumsum(x$C)-x$C
})
# Now the data are  in right order, make vector of result
DF$D = unlist(by(DF,DF$group, FUN = function(x){
  cumsum(x$C)
}))
# You could sort by row names now to restore the old order

-- 
View this message in context: http://r.789695.n4.nabble.com/Conditional-sum-tp3315163p3315279.html
Sent from the R help mailing list archive at Nabble.com.



More information about the R-help mailing list