[Rd] Any interest in "merge" and "by" implementations specifically for sorted data?
Kevin B. Hendricks
kevin.hendricks at sympatico.ca
Sat Jul 29 06:32:21 CEST 2006
Hi Bill,
>>> sum : igroupSums
Okay, after thinking about this ...
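# assumes i holds small integer group codes in 1..n (e.g. the codes of a
# factor with n levels)
# v is some long vector of the same length as i
# no sorting required
igroupSums <- function(v, i) {
  sums <- rep(0, max(i))            # one accumulator per group
  for (j in seq_along(v)) {         # seq_along() is safe when v is empty
    sums[[i[[j]]]] <- sums[[i[[j]]]] + v[[j]]
  }
  sums
}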
If written in Fortran or C, this might be faster than using split. It
is at least linear in the length of the vector v. The approach could
easily be run in parallel on t threads by picking t starting points
along v and running the routine on each piece. You could even do it
without thread locking, either if the elements of "sums" can be
updated atomically or by creating a separate copy of "sums" for each
piece and then doing a final addition.
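
A minimal sketch of the "multiple copies of sums" idea, in plain R. The
name igroupSumsChunked and the argument t are hypothetical, and the pieces
run one after another with lapply here. The sketch only shows that each
piece can accumulate into its own private copy and that a final addition
gives the totals. It assumes v is non-empty and i holds integer codes in
1..max(i).

```r
# Hypothetical helper: cut v into t contiguous pieces, accumulate a private
# copy of the sums for each piece, then add the copies together.
igroupSumsChunked <- function(v, i, t = 4) {
  n <- max(i)
  # Label each position of v with the piece (1..t) it belongs to.
  piece <- ceiling(seq_along(v) * t / length(v))
  partial <- lapply(seq_len(t), function(p) {
    sums <- numeric(n)              # private accumulator, so no locking
    for (j in which(piece == p)) {
      sums[i[j]] <- sums[i[j]] + v[j]
    }
    sums
  })
  Reduce(`+`, partial)              # the final addition across pieces
}

v <- c(1, 2, 3, 4, 5, 6, 7, 8)
i <- c(1L, 2L, 1L, 3L, 2L, 1L, 3L, 2L)
result <- igroupSumsChunked(v, i, t = 3)
```

Because no piece writes to another piece's accumulator, the order in which
the pieces are processed does not matter. That independence is what would
let the loop body run on separate threads in a compiled version.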
I still think I am missing some obvious way to do this but ...
Am I thinking along the right lines?
Kevin