[R] How to union the elements in a list?

Henrik Bengtsson hb at stat.berkeley.edu
Wed Oct 28 21:03:41 CET 2009

unlist(..., use.names=FALSE) is heaps faster than the default
unlist(..., use.names=TRUE); cf. y2 vs. y2b below (2.51 s vs. 0.05 s elapsed):

> z <- split(sample(1000,1e6,rep=TRUE),rep(1:1e5,10))
> system.time(y1 <- Reduce(union,z))
   user  system elapsed
   5.98    0.00    5.89
> system.time(y2 <- unique(unlist(z)))
   user  system elapsed
   2.62    0.02    2.51
> system.time(y2b <- unique(unlist(z, use.names=FALSE)))
   user  system elapsed
   0.03    0.00    0.05
> system.time(y3 <- unique(do.call(c,z)))
   user  system elapsed
   2.28    0.03    2.37
> identical(y1,y2)
[1] TRUE
> identical(y1,y2b)
[1] TRUE
> identical(y2,y3)
[1] TRUE
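
A small sketch of where the difference comes from, on a made-up three-element
list in the style of the original question (the object names l, named, plain,
u_reduce, u_named and u_plain are only for illustration, not from the thread).
With the default use.names=TRUE, unlist() builds a name for every element of
the result; unique() then drops those names again, so here that work buys
nothing:

```r
# Hypothetical toy list modelled on the example in the original question.
l <- list(a = c(1, 3, 4), b = c(1, 3, 6), c = c(1, 3, 7))

# Default: unlist() constructs one name per element from the list names.
named <- unlist(l)
names(named)   # "a1" "a2" "a3" "b1" ...

# use.names = FALSE skips building those names entirely.
plain <- unlist(l, use.names = FALSE)
names(plain)   # NULL

# unique() discards names, so all three routes give the same vector.
u_reduce <- Reduce(union, l)
u_named  <- unique(named)
u_plain  <- unique(plain)
identical(u_reduce, u_plain) && identical(u_named, u_plain)
```

If the names are actually needed downstream, keep the default; for a plain
union of values they can safely be skipped.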

/H

On Wed, Oct 28, 2009 at 12:51 PM, Bert Gunter <gunter.berton at gene.com> wrote:
> ... and just for amusement: unique(do.call(c,l))
>
> The do.call and unlist approaches should be faster than Reduce; do.call
> _may_ be marginally faster than unlist. Here's a timing comparison:
>
>
>> z <- split(sample(1000,1e6,rep=TRUE),rep(1:1e5,10))
>> length(z)
> [1] 100000
>
> ## the comparisons:
>
>> system.time(y1 <- Reduce(union,z))
>   user  system elapsed
>   5.02    0.00    5.03
>
>> system.time(y2 <- unique(unlist(z)))
>   user  system elapsed
>   1.92    0.00    1.92
>
>> system.time(y3 <- unique(do.call(c,z)))
>   user  system elapsed
>   1.75    0.00    1.75
>
>> identical(y1,y2)
> [1] TRUE
>> identical(y2,y3)
> [1] TRUE
>
> Obviously, this is unlikely to matter for any reasonably sized dataset, but
> maybe it's instructive.
>
> Of course, Reduce wins the RGolf contest  ;-)
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
>
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
> Behalf Of Ben Bolker
> Sent: Wednesday, October 28, 2009 12:27 PM
> To: r-help at r-project.org
> Subject: Re: [R] How to union the elements in a list?
>
>
>
>
> Peng Yu wrote:
>>
>> Suppose that I have a list of vectors. I want to compute the union of
>> all the vectors in the list. I could use a 'for' loop to do so, but I'm
>> wondering whether there is a better solution that does not need a 'for'
>> loop.
>>
>> l=list(a=c(1,3,4), b=c(1,3,6), c=c(1,3,7), ....)
>>
>>
>
> Reduce(union,l)