[R] summarize a vector

arun smartpink111 at yahoo.com
Sat Aug 11 01:34:35 CEST 2012


Hi,

Same result, with data.frame:
dat1<-data.frame(V1=v[1:3],V2=v[4:6],V3=v[7:9],V4=c(v[10],rep(0,2)))
sapply(dat1,cumsum)[3,]
V1 V2 V3 V4 
 6 15 24 10 
 sapply(dat1,sum)
V1 V2 V3 V4 
 6 15 24 10 
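
The two calls agree because the last row of a column-wise cumsum() is the column total. A minimal sketch that builds the same columns without hard-coding the indices, assuming v <- 1:10 as in the original question (the names k, pad and dat are only illustrative):

```r
v <- 1:10
k <- 3                                   # chunk size (rows per column)
pad <- (k - length(v) %% k) %% k         # zeros needed to fill the last column
dat <- as.data.frame(matrix(c(v, rep(0, pad)), nrow = k))

last_cumsum <- sapply(dat, cumsum)[k, ]  # last row of the running sums
totals <- sapply(dat, sum)               # plain column sums

# Both give the per-chunk sums 6 15 24 10
stopifnot(all(last_cumsum == totals))
```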
A.K.




----- Original Message -----
From: David Winsemius <dwinsemius at comcast.net>
To: Michael Weylandt <michael.weylandt at gmail.com>
Cc: "r-help at r-project.org" <r-help at r-project.org>; "sds at gnu.org" <sds at gnu.org>; Bert Gunter <gunter.berton at gene.com>
Sent: Friday, August 10, 2012 6:58 PM
Subject: Re: [R] summarize a vector


On Aug 10, 2012, at 3:42 PM, Michael Weylandt wrote:

> I wouldn't be surprised if one could get an *apply-free solution by using diff(), cumsum() and selective indexing as well.

What about colSums on a matrix extended with the right number of zeros?

> colSums(matrix(c(v, rep(0, (3 - length(v) %% 3) %% 3)), nrow = 3))
[1]  6 15 24 10
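
The same padding idea can be sketched as a small function for any chunk size k. The name chunk_sums is hypothetical (it does not appear in the thread). The pad length is reduced modulo k so that no all-zero column is appended when length(x) is already a multiple of k:

```r
# Sum consecutive chunks of length k; the last chunk may be shorter.
chunk_sums <- function(x, k) {
  pad <- (k - length(x) %% k) %% k            # 0 when length(x) is a multiple of k
  colSums(matrix(c(x, rep(0, pad)), nrow = k))
}

chunk_sums(1:10, 3)   # 6 15 24 10
chunk_sums(1:9, 3)    # 6 15 24 (no trailing 0)
chunk_sums(1:10, 15)  # 55
```

Padding with zeros is safe here only because the summary is a sum; for something like mean() the padded zeros would distort the last chunk.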

(My experience is that tapply is generally fairly fast anyway, much faster than apply.data.frame. So I do not lump all *apply solutions in the same efficiency category.)
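
Anyone who wants to check relative speed on their own machine could use a rough sketch like the following. The helper names by_tapply and by_colsums and the test vector are assumptions for illustration, and the timings will vary with hardware and R version, so no particular result is claimed:

```r
set.seed(1)
x <- runif(1e5)   # arbitrary test data
k <- 7            # arbitrary chunk size

# tapply with an integer-division grouping index
by_tapply <- function(x, k) tapply(x, (seq_along(x) - 1) %/% k, sum)

# zero-padded matrix plus colSums
by_colsums <- function(x, k) {
  pad <- (k - length(x) %% k) %% k
  colSums(matrix(c(x, rep(0, pad)), nrow = k))
}

# Confirm both give the same sums before timing them
stopifnot(isTRUE(all.equal(unname(as.vector(by_tapply(x, k))), by_colsums(x, k))))

print(system.time(for (i in 1:20) by_tapply(x, k)))
print(system.time(for (i in 1:20) by_colsums(x, k)))
```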

--David.
> 
> Cheers,
> Michael
> 
> On Aug 10, 2012, at 5:07 PM, David Winsemius <dwinsemius at comcast.net> wrote:
> 
>> 
>> On Aug 10, 2012, at 12:57 PM, Bert Gunter wrote:
>> 
>>> ... or perhaps even simpler:
>>> 
>>>> sz <- function(x,k)tapply(x,(seq_along(x)-1)%/%k, sum)
>>>> sz(1:10,3)
>>> 0  1  2  3
>>> 6 15 24 10
>>> 
>>> Note that this works for k>n, where the previous solution does not.
>>>> sz(1:10,15)
>>> 0
>>> 55
>> 
>> I agree that it is more elegant, but I do not get an error or an unexpected result with my method.
>> 
>>> N=10
>>> k=15
>>> w <- tapply( v ,rep(1:(N/k +1), each=k, len=N ) , sum)
>>> w
>> 1
>> 55
>> 
>> A different label but the same result. I'm protected from the typical 1:0 problem that seq_along solves by including +1 in the second argument to ":"/seq(). Unless, of course, you set N to a negative number, but that wouldn't make much sense, would it? And you get an error from rep() anyway.
>> 
>> Best;
>> David.
>> 
>>> 
>>> -- Bert
>>> 
>>> On Fri, Aug 10, 2012 at 12:37 PM, David Winsemius
>>> <dwinsemius at comcast.net> wrote:
>>>> 
>>>> On Aug 10, 2012, at 12:20 PM, Sam Steingold wrote:
>>>> 
>>>>> I have a long numeric vector v (length N) and I want to create a shorter
>>>>> vector of length ceiling(N/k) consisting of sums of k-subsequences of v:
>>>>> 
>>>>> v <- c(1,2,3,4,5,6,7,8,9,10)
>>>>> 
>>>>> N=10, k=3
>>>>> ===> [6,15,24,10]
>>>>> 
>>>>> I can, of course, iterate:
>>>>> 
>>>>>> w <- vector(mode="numeric",length=ceiling(N/k))
>>>>>> for (i in 1:length(w)) w[i] <- sum(v(i*k:(i+1)*k))
>>>>> 
>>>>> 
>>>>> (modulo boundary conditions)
>>>>> but I wonder if there is a better way.
>>>> 
>>>> 
>>>> Well, using v with parentheses instead of square-brackets might not be the
>>>> right way, since v is not a function.
>>>> 
>>>> Consider this alternate (no need to pre-allocate 'w'):
>>>> 
>>>>> w <- tapply( v ,rep(1:(N/k +1), each=k, len=N ) , sum)
>>>>> w
>>>> 1  2  3  4
>>>> 6 15 24 10
>>>> 
>>>> --
>>>> 
>>>> David Winsemius, MD
>>>> Alameda, CA, USA
>>>> 
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>> 
>>> 
>>> 
>>> --
>>> Bert Gunter
>>> Genentech Nonclinical Biostatistics
>>> 
>>> Internal Contact Info:
>>> Phone: 467-7374
>>> Website:
>>> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
