[R] Efficient access to elements of a list of lists
Henrik Bengtsson
hb at biostat.ucsf.edu
Sun Mar 11 19:34:42 CET 2012
On Sun, Mar 11, 2012 at 9:18 AM, Benilton Carvalho
<beniltoncarvalho at gmail.com> wrote:
> Hi,
>
> I have a long list of lists from which I want to efficiently extract
> and rbind elements. So I'm using the approach below:
>
>
> f <- function(i){
> out <- replicate(5, list(matrix(rnorm(80), nc=20)))
> names(out) <- letters[1:5]
> out
> }
> set.seed(1)
> lst <- lapply(1:1.5e6, f)
> (t0 <- system.time(tmp <- do.call(rbind, lapply(lst, '[[', 'b'))))
>
>
> Is there anything better/faster than the do.call+rbind+lapply combo
> above?
The "[[" function is generic, so each call involves method dispatching.
You can avoid that by calling .subset2() directly. That may save you some
(micro?)seconds per call.
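A minimal sketch of the .subset2() idea, in case it helps. The list
'lstSmall' and the helper 'makeElement' are made-up stand-ins for the real
data (1000 elements rather than 1.5e6), not part of the original example:

```r
# Hypothetical small stand-in for 'lst': 1000 lists, each holding
# five 4-by-20 matrices named 'a' to 'e'
set.seed(1)
makeElement <- function(i) {
  out <- lapply(1:5, function(j) matrix(rnorm(80), ncol = 20))
  names(out) <- letters[1:5]
  out
}
lstSmall <- lapply(1:1000, makeElement)

# "[[" is generic, so every call goes through method dispatch
b1 <- lapply(lstSmall, `[[`, "b")

# .subset2() does the same extraction for plain lists, without dispatch
b2 <- lapply(lstSmall, .subset2, "b")

# Both approaches extract the same elements
stopifnot(identical(b1, b2))
```

Whether the difference is noticeable depends on the data; wrapping each
lapply() call in system.time() on the real list is the way to find out.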
Now, if all extracted elements truly have the same dimensions,
> bList <- lapply(lst, FUN='[[', 'b')
> str(head(bList))
List of 6
$ : num [1:4, 1:20] 0.936 -0.844 -0.221 -0.581 -2.513 ...
$ : num [1:4, 1:20] -0.2618 0.0259 -1.3131 -0.0547 -0.3296 ...
$ : num [1:4, 1:20] -1.589 0.844 -1.121 0.21 -0.846 ...
$ : num [1:4, 1:20] -1.192 -1.268 1.688 -0.295 0.466 ...
$ : num [1:4, 1:20] 2.504 -0.833 -1.751 1.117 -0.775 ...
$ : num [1:4, 1:20] 0.119 -0.313 1.741 0.403 -0.261 ...
then you can avoid the rbind() by doing an unlist()/dim()/aperm() instead, e.g.
# Extract 'b' as a 4-by-20-by-1.5e6 array
dim0 <- dim(bList[[1]])   # dimensions of one element, here c(4, 20)
n <- length(bList)
bArray <- unlist(bList, use.names=FALSE)
dimA <- c(dim0, n)
dim(bArray) <- dimA
# If you really need a matrix, then...
# Turning it into a (4*1.5e6)-by-20 matrix
dimM <- dim0
dimM[1] <- n*dimM[1]
# Move the element index next to the row index so the rows stack per element
bMatrix <- aperm(bArray, perm=c(1,3,2))
dim(bMatrix) <- dimM
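As a sanity check, here is a small sketch comparing the
unlist()/dim()/aperm() result with what do.call(rbind, ...) gives. The
list 'bListSmall' is a made-up stand-in with 50 elements, assumed to all
be 4-by-20 as above:

```r
# Hypothetical small stand-in for 'bList': 50 matrices, each 4-by-20
set.seed(1)
bListSmall <- lapply(1:50, function(i) matrix(rnorm(80), ncol = 20))

# Reference result from the original do.call+rbind approach
ref <- do.call(rbind, bListSmall)

# The unlist()/dim()/aperm() approach
d <- dim(bListSmall[[1]])
n <- length(bListSmall)
arr <- unlist(bListSmall, use.names = FALSE)
dim(arr) <- c(d, n)                    # 4-by-20-by-50 array
mat <- aperm(arr, perm = c(1, 3, 2))   # 4-by-50-by-20 array
dim(mat) <- c(n * d[1], d[2])          # 200-by-20 matrix

# Rows are stacked element by element, exactly as rbind() does
stopifnot(identical(mat, ref))
```

The aperm() step matters because R stores arrays column-major: moving the
element index into the second position makes the four rows of each element
contiguous within every column before the dimensions are collapsed.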
You owe me a beer ;)
/Henrik
> On this example, the combo takes roughly 20s on my machine...
> but on the data I'm working with, it takes more than 1 minute... And
> given that I need to repeat the task several times, the cumul. amount
> of time is significant for me.
>
> Thank you for any suggestion/comment,
>
> benilton