[R] sum portions of a vector

William Dunlap wdunlap at tibco.com
Mon Dec 10 23:20:26 CET 2012


If you have a large number of small groups, tapply(x, factor, sum) can
be sped up by replacing it with Rigroup::igroupSums(x, as.integer(factor)), as in:

  > library(Rigroup)
  > x <- 1:1e6
  > fgroup <- factor(c(seq_len(length(x)/2), sample(length(x)/2, size=length(x)/2, replace=TRUE)))
  > # 5*10^5 small groups
  > system.time(v1 <- tapply(x, fgroup, sum))
     user  system elapsed
    3.193   0.020   3.220
  > system.time(v2 <- igroupSums(x, as.integer(fgroup)))
     user  system elapsed
    0.044   0.000   0.046
  > all(v1==v2)
  [1] TRUE
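
If installing Rigroup is not an option, base R's rowsum() computes the same grouped sums. The following is only a sketch on a smaller vector than the one timed above; the seed, the size n, and the name v3 are illustrative assumptions, and no timings are claimed for it.

```r
# Sketch: grouped sums with base R's rowsum(), checked against tapply().
set.seed(1)                      # arbitrary seed, only for reproducibility
n <- 1e4                         # assumption: smaller than the 1e6 used above
x <- seq_len(n)
fgroup <- factor(c(seq_len(n/2), sample(n/2, size = n/2, replace = TRUE)))

v1 <- tapply(x, fgroup, sum)     # one sum per factor level
v3 <- rowsum(x, fgroup)[, 1]     # rowsum() returns a one-column matrix; keep the column

all(v1 == v3)                    # both results are in factor-level order
```

Wrapping each call in system.time(), as in the transcript above, shows how the two compare on your own data.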

(The igroup<FUN> functions are in S+ and do the fast numerical work for
the group<FUN> functions.  Rigroup looks like it is based on the S+ igroup<FUN>
functions but does not have the 
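
For the contiguous blocks in the original question, a grouping factor is not strictly needed. A minimal sketch, assuming breaks holds the last index of each block (which coincides with the values here only because vec is 1:10):

```r
vec <- 1:10
breaks <- c(3, 8, 10)            # assumption: last index of each contiguous block

# Take the running total at each break point, then difference successive totals.
sums <- diff(c(0, cumsum(vec)[breaks]))
sums
# [1]  6 30 19
```

For long integer vectors, cumsum(as.numeric(vec)) avoids integer overflow in the running total.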

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of David L Carlson
> Sent: Monday, December 10, 2012 12:08 PM
> To: sds at gnu.org; r-help at r-project.org
> Subject: Re: [R] sum portions of a vector
> 
> How about?
> 
> > vec <- 1:10
> > breaks <- c(3,8,10)
> > g <- cut(vec, c(0, breaks))
> > sums <- aggregate(vec, list(g), sum)$x
> > nums <- tapply(vec, g, paste0, collapse="+")
> > results <- paste0(sums, " = ", nums)
> > results
> [1] "6 = 1+2+3"      "30 = 4+5+6+7+8" "19 = 9+10"
> 
> ----------------------------------------------
> David L Carlson
> Associate Professor of Anthropology
> Texas A&M University
> College Station, TX 77843-4352
> 
> 
> > -----Original Message-----
> > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> > On Behalf Of Sam Steingold
> > Sent: Monday, December 10, 2012 1:29 PM
> > To: r-help at r-project.org
> > Subject: [R] sum portions of a vector
> >
> > How do I sum portions of a vector into another vector?
> > E.g., for
> > --8<---------------cut here---------------start------------->8---
> > > vec <- 1:10
> > > breaks <- c(3,8,10)
> > --8<---------------cut here---------------end--------------->8---
> > I want to get a vector of length 3 with content
> > --8<---------------cut here---------------start------------->8---
> > 6 = 1+2+3
> > 30 = 4+5+6+7+8
> > 19 = 9+10
> > --8<---------------cut here---------------end--------------->8---
> > Obviously, I could write a loop, but I would rather have a vectorized
> > version.
> > Thanks!
> >
> > --
> > Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X
> > 11.0.11103000
> > http://www.childpsy.net/ http://palestinefacts.org http://ffii.org
> > http://jihadwatch.org http://www.PetitionOnline.com/tap12009/
> > One can find Holy Grail or Higgs boson, but not the second sock.
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.



