[R] understanding output of tapply/by cumsum

Gerrit Draisma g.draisma at erasmusmc.nl
Wed Dec 8 16:56:46 CET 2010


Thanks Jim,
ave() does what I wanted.
It is simpler and probably more efficient
than unlisting Sn as I tried.

Still, I remain puzzled by the structure
of the by() or tapply() output and how
to access the individual cumsums.
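
A minimal sketch of one way to reach the individual cumsums, assuming
the example data from the original post quoted below. The names cell,
a_levels, c_levels and the loop are illustrative only; ave() remains
the simpler route.

```r
# Rebuild the example from the original post.
d <- expand.grid(a = 1:5, b = 1:3, c = 1:2)
d$n <- 10 * d$a + d$b + 0.1 * d$c
Sn <- by(d$n, list(d$a, d$c), cumsum)

# Sn is a 5 x 2 array of mode "list": one cell per (a, c) combination.
# Sn[5, 2] keeps the list wrapper; Sn[[5, 2]] returns the numeric vector.
cell <- Sn[[5, 2]]
print(cell)

# Copy each cell back to the rows it came from. The values inside a cell
# follow the row order of d within that group.
d$Sn <- NA_real_
a_levels <- dimnames(Sn)[[1]]
c_levels <- dimnames(Sn)[[2]]
for (i in seq_along(a_levels)) {
  for (j in seq_along(c_levels)) {
    rows <- as.character(d$a) == a_levels[i] &
            as.character(d$c) == c_levels[j]
    d$Sn[rows] <- Sn[[i, j]]
  }
}
```

With this data the resulting d$Sn should agree with the column that
ave(d$n, d$a, d$c, FUN = cumsum) produces.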

Yes, the split command is useful for checking
the result.
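
A sketch of such a check, assuming the same example data; the names
groups and ok are hypothetical, chosen only for illustration.

```r
d <- expand.grid(a = 1:5, b = 1:3, c = 1:2)
d$n <- 10 * d$a + d$b + 0.1 * d$c
d$cum <- ave(d$n, d$a, d$c, FUN = cumsum)

# One data frame per (a, c) group; list names have the form "a.c".
groups <- split(d, list(d$a, d$c))
print(groups[["5.2"]])

# Within each group, cum should be the running total of n.
ok <- vapply(groups,
             function(g) isTRUE(all.equal(g$cum, cumsum(g$n))),
             logical(1))
all(ok)
```
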
Gerrit.


On 12/7/2010 1:43 PM, jim holtman wrote:
> Maybe 'ave' is what you were looking for:
>
>> d$cum<- ave(d$n, d$a, d$c, FUN = cumsum)
>> d
>    a b c    n   cum
> 1  1 1 1 11.1  11.1
> 2  2 1 1 21.1  21.1
> 3  3 1 1 31.1  31.1
> 4  4 1 1 41.1  41.1
> 5  5 1 1 51.1  51.1
> 6  1 2 1 12.1  23.2
> 7  2 2 1 22.1  43.2
> 8  3 2 1 32.1  63.2
> 9  4 2 1 42.1  83.2
> 10 5 2 1 52.1 103.2
> 11 1 3 1 13.1  36.3
> 12 2 3 1 23.1  66.3
> 13 3 3 1 33.1  96.3
> 14 4 3 1 43.1 126.3
> 15 5 3 1 53.1 156.3
> 16 1 1 2 11.2  11.2
> 17 2 1 2 21.2  21.2
> 18 3 1 2 31.2  31.2
> 19 4 1 2 41.2  41.2
> 20 5 1 2 51.2  51.2
> 21 1 2 2 12.2  23.4
> 22 2 2 2 22.2  43.4
> 23 3 2 2 32.2  63.4
> 24 4 2 2 42.2  83.4
> 25 5 2 2 52.2 103.4
> 26 1 3 2 13.2  36.6
> 27 2 3 2 23.2  66.6
> 28 3 3 2 33.2  96.6
> 29 4 3 2 43.2 126.6
> 30 5 3 2 53.2 156.6
>>
>
>
> On Tue, Dec 7, 2010 at 6:39 AM, Gerrit Draisma<gdraisma at xs4all.nl>  wrote:
>> Dear R-users,
>>
>> I have a dataset with categories and numbers.
>> I would like to compute and add cumulative numbers
>> to the dataset.
>> I do not understand the structure of by(...) or
>> tapply(...) output enough to handle it.
>>
>> Here is a small example:
>> --------------
>> d <- expand.grid(a = 1:5, b = 1:3, c = 1:2)
>> d$n <- 10 * d$a + d$b + 0.1 * d$c
>> Sn <- by(d$n, list(d$a, d$c), cumsum)
>> str(Sn)
>> ---------
>> List of 10
>>   $ : num [1:3] 11.1 23.2 36.3
>>   $ : num [1:3] 21.1 43.2 66.3
>>   $ : num [1:3] 31.1 63.2 96.3
>>   $ : num [1:3]  41.1  83.2 126.3
>>   $ : num [1:3]  51.1 103.2 156.3
>>   $ : num [1:3] 11.2 23.4 36.6
>>   $ : num [1:3] 21.2 43.4 66.6
>>   $ : num [1:3] 31.2 63.4 96.6
>>   $ : num [1:3]  41.2  83.4 126.6
>>   $ : num [1:3]  51.2 103.4 156.6
>>   - attr(*, "dim")= int [1:2] 5 2
>>   - attr(*, "dimnames")=List of 2
>>   ..$ : chr [1:5] "1" "2" "3" "4" ...
>>   ..$ : chr [1:2] "1" "2"
>>   - attr(*, "call")= language by.default(data = d$n, INDICES = list(d$a, d$c), FUN = cumsum)
>>   - attr(*, "class")= chr "by"
>> ---------
>> # these return lists of numeric vectors, not the vectors themselves
>> Sn[5,2]
>> Sn[cbind(d$a,d$c)]
>> # how to access the individual cumsum values?
>> # and assign them to d$Sn?
>> --------------
>>
>> Thanks,
>> Gerrit.
>>
>> ---
>> Gerrit Draisma
>> Department of Public Health
>> Erasmus MC, University Medical Center Rotterdam
>> Room AE-235
>> P.O. Box 2040 3000 CA  Rotterdam The Netherlands
>> Phone: +31 10 7043787 Fax: +31 10 7038474
>> http://mgzlx4.erasmusmc.nl/pwp/?gdraisma
>>


