[R] How do I use "tapply" in this case ?

Bert Gunter gunter.berton at gene.com
Fri Feb 5 07:50:48 CET 2010


Folks:

You can use matrix subscripting and avoid R-level loops and apply calls
altogether. This will end up being many times faster.

Here's your original code:

Z=matrix(rnorm(20), nrow=4)
index=replicate(4, sample(1:5, 3))
P=4
tmpr=list()
for (i in 1:P)
{
  tmp = Z[i,index[,i]]
  tmpr[[i]]=tmp
}

For clarity, here's the index matrix I got:
> index
     [,1] [,2] [,3] [,4]
[1,]    5    1    2    3
[2,]    2    2    4    4
[3,]    1    5    5    5

Here's what I got for tmpr when I used your code:

> tmpr
[[1]]
[1] -0.6246316 -0.8695538 -0.4136176

[[2]]
[1]  0.02885345 -1.89837071  0.43195955

[[3]]
[1]  0.2453368 -0.1788287 -0.6620405

[[4]]
[1] -0.87077697 -1.62554371  0.04464793

So the ith component of tmpr is just what the indices in the ith column
of index pick out of the ith row of Z. That is, the first component of tmpr
consists of the (1,5), (1,2), and (1,1) elements of Z. Matrix (in general,
array) indexing (read the man page for "[" carefully: it's documented in
the "Matrices and Arrays" section) allows you to "stack" these pairs (for
n-dim arrays, n-tuples) row-wise into a matrix and use this matrix as an
index:

> Z[cbind(c(1,1,1),index[,1])]
[1] -0.6246316 -0.8695538 -0.4136176
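
As a minimal sketch of the same idea on values that are easy to check by
eye, here is a small made-up matrix (m, idx and picked are hypothetical
names, not part of the original problem). Each row of the two-column index
matrix is one (row, column) pair:

```r
# Hypothetical 3 x 4 matrix, filled column-wise with 1..12
m <- matrix(1:12, nrow = 3)

# Each row of idx is one (row, column) pair: (1,4), (3,1), (2,2)
idx <- cbind(c(1, 3, 2), c(4, 1, 2))

# Indexing with a two-column matrix returns one element per row of idx
picked <- m[idx]
print(picked)
```

The result is a plain vector with one element per pair, in the order the
pairs appear in idx.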

So you can do everything at once (making use of R's column-wise storage of
arrays) with:

result <- Z[cbind(as.vector(col(index)), as.vector(index))]

which gives:

 [1] -0.62463163 -0.86955383 -0.41361765  0.02885345 -1.89837071  0.43195955  0.24533679
 [8] -0.17882867 -0.66204048 -0.87077697 -1.62554371  0.04464793

Note that this vector is the same as: unlist(tmpr). So you can turn it into
a matrix e.g. where column i is the ith component of tmpr by:

dim(result) <- dim(index)
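
If you want to convince yourself that the two approaches agree, the
following self-contained sketch rebuilds the example and compares them. The
seed is an arbitrary choice of mine, so the random values will differ from
the ones printed above:

```r
set.seed(1)  # arbitrary seed (an assumption), only for reproducibility
Z <- matrix(rnorm(20), nrow = 4)
index <- replicate(4, sample(1:5, 3))   # 3 x 4: column i indexes row i of Z

# Loop version, as in the original code
tmpr <- list()
for (i in 1:4) tmpr[[i]] <- Z[i, index[, i]]

# Matrix-subscripting version: col(index) supplies the row of Z for each
# entry, and index itself supplies the column
result <- Z[cbind(as.vector(col(index)), as.vector(index))]
same_as_loop <- identical(result, unlist(tmpr))

# Reshape so that column i holds the ith component of tmpr
dim(result) <- dim(index)
print(same_as_loop)
```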

As I said, for large problems, this should be wayyyyy faster than explicit
loops or the hidden (and optimized, but still) loops of apply functions.
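
A rough way to check the speed claim on your own machine is sketched below.
The sizes n, p and k are hypothetical values I picked for illustration, and
the timings will vary with hardware and R version, so treat this as a quick
comparison rather than a careful benchmark:

```r
set.seed(42)                   # arbitrary seed (an assumption)
n <- 5000; p <- 100; k <- 20   # hypothetical sizes: n rows, p columns, k picks per row
Z <- matrix(rnorm(n * p), nrow = n)
index <- replicate(n, sample(seq_len(p), k))   # k x n: column i indexes row i of Z

# Explicit loop over the rows of Z
loop_time <- system.time({
  tmpr <- vector("list", n)
  for (i in seq_len(n)) tmpr[[i]] <- Z[i, index[, i]]
})

# Single matrix-subscripting call
vec_time <- system.time({
  result <- Z[cbind(as.vector(col(index)), as.vector(index))]
})

print(loop_time)
print(vec_time)

# Both approaches should return the same values in the same order
same <- identical(result, unlist(tmpr))
```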


Bert Gunter
Genentech Nonclinical Statistics




-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of RICHARD M. HEIBERGER
Sent: Thursday, February 04, 2010 9:10 PM
To: Carrie Li
Cc: r-help
Subject: Re: [R] How do I use "tapply" in this case ?

## reproduces your results
lapply(1:4, function(i, x, y) {x[i,y[,i]]}, Z, index )

## collapses your list into a set of columns
sapply(1:4, function(i, x, y) {x[i,y[,i]]}, Z, index )
