# [R] fast subsetting of lists in lists

Henrik Bengtsson hb at biostat.ucsf.edu
Tue Dec 7 19:11:30 CET 2010

First, subset 'test' once, e.g.

testT <- test[1:3];

and then use sapply() on that, e.g.

val <- sapply(testT, FUN=function (x) { x$a })

Then you can avoid one level of function calls, by

val <- sapply(testT, FUN="[[", "a")
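As a minimal sketch of the two calls above, the following uses the example list from the quoted question and checks that both forms return the same vector. The names val1 and val2 are only for illustration.

```r
# Example data taken from the question quoted further down.
test <- list(list(a = 1, b = 2, c = 3),
             list(a = 4, b = 5, c = 6),
             list(a = 7, b = 8, c = 9))
testT <- test[1:3]

# Anonymous-function form: one extra R-level call per element.
val1 <- sapply(testT, FUN = function(x) x$a)

# Passing "[[" directly; "a" is forwarded as the index argument.
val2 <- sapply(testT, FUN = "[[", "a")

# Both forms should give the same numeric vector.
stopifnot(identical(val1, val2))
print(val2)
```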

Second, there is some overhead in "[[", "$" etc., since they do method
dispatch.  You can use .subset2() to avoid this, e.g.

val <- sapply(testT, FUN=.subset2, "a")
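To illustrate what .subset2() skips, here is a small sketch. The class "myclass" and its "[[" method are hypothetical and exist only to make the dispatch visible; on plain lists, such as the example data, both routes return the same value.

```r
# Example data taken from the question quoted further down.
test <- list(list(a = 1, b = 2, c = 3),
             list(a = 4, b = 5, c = 6),
             list(a = 7, b = 8, c = 9))

# On plain lists, "[[" and .subset2() agree.
via_bracket <- sapply(test, FUN = "[[", "a")
via_subset2 <- sapply(test, FUN = .subset2, "a")
stopifnot(identical(via_bracket, via_subset2))

# Hypothetical S3 class with its own "[[" method (illustration only).
obj <- structure(list(a = 1), class = "myclass")
`[[.myclass` <- function(x, i, ...) "dispatched"

res_dispatch <- obj[["a"]]          # goes through the S3 method
res_direct   <- .subset2(obj, "a")  # bypasses dispatch, reads the list element

print(res_dispatch)
print(res_direct)
```

Because .subset2() ignores class-specific methods, it is only a safe replacement when the elements are plain lists or when bypassing their methods is intended.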

Third, it may be that using sapply() to structure your results is a bit
overkill.  If you know that the 'a' element is always of the same
dimension, you can do it yourself, e.g.

val <- lapply(testT, FUN=.subset2, "a")
val <- unlist(val, use.names=FALSE)   # use.names=FALSE is much faster than TRUE
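A rough way to see whether this matters for your data is to time both variants on a larger list. The sketch below is an assumption-laden illustration: the list 'big' and its size n are made up, and the timings depend on the machine and R version, so no particular speedup is claimed.

```r
# Hypothetical larger data set with the same shape as 'test'.
n <- 1e5
big <- lapply(seq_len(n), function(i) list(a = i, b = i + 1L, c = i + 2L))

# Variant from the original question: sapply() with an anonymous function.
t_sapply <- system.time(
  v1 <- sapply(big, FUN = function(x) x$a)
)

# Variant suggested above: lapply() with .subset2(), then unlist().
t_unlist <- system.time(
  v2 <- unlist(lapply(big, FUN = .subset2, "a"), use.names = FALSE)
)

# The two results should be identical; only the elapsed time differs.
stopifnot(identical(v1, v2))
print(c(sapply = t_sapply[["elapsed"]], lapply_unlist = t_unlist[["elapsed"]]))
```

Note that unlist() simply concatenates, so if some 'a' elements had length other than one, the result would no longer line up one-to-one with the list elements.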

See what that does.

/Henrik

On Tue, Dec 7, 2010 at 6:47 AM, Alexander Senger
<senger at physik.hu-berlin.de> wrote:
> Hello,
>
>
> my data is contained in nested lists (which seems not necessarily to be
> the best approach). What I need is a fast way to get subsets from the data.
>
> An example:
>
> test <- list(list(a = 1, b = 2, c = 3), list(a = 4, b = 5, c = 6),
> list(a = 7, b = 8, c = 9))
>
> Now I would like to have all values in the named variables "a", that is
> the vector c(1, 4, 7). The best I could come up with is:
>
> val <- sapply(1:3, function (i) {test[[i]]$a})
>
> which is unfortunately not very fast. According to R-inferno this is due
> to the fact that apply and its derivatives do looping in R rather than
> rely on C subroutines as the common [ operator does.
>
> Does someone know a trick to do the same as above with the faster
> built-in subsetting? Something like:
>
> test[<somesubsettingmagic>]
>
>
>
>
> Alex
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help