[R] how to bind lists recursively
John Fox
jfox at mcmaster.ca
Wed May 28 19:13:52 CEST 2008
Dear Brian,
Thanks for the explanation -- it makes sense of what I've observed in a
range of problems. I think that the bottom line is that in typical problems
involving lists, pre-allocation of the pointers won't make much
(proportional) difference to the total time.
For curiosity, I modified the example so that the list of matrices is of
length 10000 rather than 1000; repeating the problem 10 times, I got the
following average timings:
Pre-allocating the list:
user.self sys.self elapsed
26.789 0.000 26.828
Not pre-allocating the list
user.self sys.self elapsed
27.223 0.000 27.269
Regards,
John
------------------------------
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On
> Behalf Of Prof Brian Ripley
> Sent: May-28-08 9:12 AM
> To: John Fox
> Cc: r-help at r-project.org; Bill.Venables at csiro.au
> Subject: Re: [R] how to bind lists recursively
>
> On Wed, 28 May 2008, John Fox wrote:
>
> > Dear Brian and Bill,
> >
> > Here's an interesting contrasting example (taken from this month's Help
> Desk
> > column in R News, which Bill has already seen), first verifying the
> relative
> > timings for Brian's example:
> >
> >> system.time({
> > + a <- vector("list", 10001)
> > + for(i in 0:10000) a[[i+1]] <- i
> > + })
> > user system elapsed
> > 0.03 0.00 0.03
> >
> >> system.time({
> > + a <- list()
> > + for(i in 0:10000) a[[i+1]] <- i
> > + })
> > user system elapsed
> > 0.47 0.02 0.49
> >
> >> system.time({
> > + matrices <- vector(mode="list", length=1000)
> > + for (i in 1:1000)
> > + matrices[[i]] <- matrix(rnorm(10000), 100, 100)
> > + })
> > user system elapsed
> > 2.59 0.00 2.61
> >
> >> system.time({
> > + matrices <- list()
> > + for (i in 1:1000)
> > + matrices[[i]] <- matrix(rnorm(10000), 100, 100)
> > + })
> > user system elapsed
> > 2.61 0.00 2.60
> >
> > Brian will, I'm sure, have the proper explanation, but I suppose that
the
> > NULL elements in the initialized list in the first example provide
> > sufficient space for the integers that eventually get stored there,
while
> > that's not the case for the matrices in the second example. I believe
that
> > the second example is more typical of what one would actually put in a
> list.
>
> Your explanation is incorrect as a list is just a header plus a set of
> SEXP pointers, and is never used to store data directly.
>
> The main point is that with only 1000 elements, the difference in
> re-allocating the list on your system is only going to be about (0.49 -
> 0.04)/10 ~ 0.04, too small to time accurately (especially under Windows).
> So finding 0.02 is not really contrasting.
>
> The point that pre-allocation may only help when list allocation is
> taking an appreciable part of the time is a sound one.
>
> Note that because of GC tuning such things are very variable. Running
> repeatedly either version on my Linux box gets timings from about 3.8 down
> to 2.5 seconds -- and after settling down to 2.5-2.7s, the 13th run took
> 3.2s.
>
> >
> > Regards,
> > John
> >
> > ------------------------------
> > John Fox, Professor
> > Department of Sociology
> > McMaster University
> > Hamilton, Ontario, Canada
> > web: socserv.mcmaster.ca/jfox
> >
> >> -----Original Message-----
> >> From: r-help-bounces at r-project.org
[mailto:r-help-bounces at r-project.org]
> > On
> >> Behalf Of Prof Brian Ripley
> >> Sent: May-28-08 2:04 AM
> >> To: Bill.Venables at csiro.au
> >> Cc: r-help at r-project.org
> >> Subject: Re: [R] how to bind lists recursively
> >>
> >> But pre-allocation still helps
> >>
> >> a <- vector("list", 10001)
> >> for(i in 0:10000) a[[i+1]] <- i
> >>
> >> gives
> >> user system elapsed
> >> 0.042 0.001 0.057
> >>
> >> on my system, against
> >> user system elapsed
> >> 0.743 0.000 1.907
> >>
> >> for Bill's 'best' solution.
> >>
> >> On Wed, 28 May 2008, Bill.Venables at csiro.au wrote:
> >>
> >>> This is somewhat subtle.
> >>>
> >>> Rolf's solution (here corrected to...)
> >>>
> >>> a <- list()
> >>> for(i in 0:10000) a[[i+1]] <- i
> >>>
> >>> is the best of the loop solutions (or at least the best I know of).
The
> >>> apparently similar
> >>>
> >>> a <- list(0)
> >>> for(i in 1:10000) a <- c(a, list(i))
> >>>
> >>> will take a lot longer, although the result is the same. For example:
> >>>
> >>>> system.time({
> >>> a <- list()
> >>> for(i in 0:10000) a[[i+1]] <- i
> >>> })
> >>> user system elapsed
> >>> 0.59 0.00 0.59
> >>>> system.time({
> >>> a <- list(0)
> >>> for(i in 1:10000) a <- c(a, list(i))
> >>> })
> >>> user system elapsed
> >>> 6.87 0.00 6.89
> >>>>
> >>>
> >>> That's a factor of about 11 times as long. The best of the lot is
> >>>
> >>> a <- as.list(0:10000)
> >>>
> >>> of course, but this has problems with generalisation, (which everyone
> >>> suspects is going to be needed here...).
> >>>
> >>> Bill Venables.
> >>>
> >>>
> >>>
> >>> -----Original Message-----
> >>> From: r-help-bounces at r-project.org
[mailto:r-help-bounces at r-project.org]
> >>> On Behalf Of Rolf Turner
> >>> Sent: Wednesday, 28 May 2008 1:02 PM
> >>> To: Daniel Yang
> >>> Cc: r-help at r-project.org
> >>> Subject: Re: [R] how to bind lists recursively
> >>>
> >>>
> >>> On 28/05/2008, at 2:43 PM, Daniel Yang wrote:
> >>>
> >>>> Dear all,
> >>>>
> >>>> I want to create a list that contains 0,1,2,3, ..., 10000 as its
> >>>> elements. I used the following code, which apparently doesn't work
> >>>> very well.
> >>>>
> >>>> a <- 0
> >>>> for(i in 1:10000) {
> >>>> a <- list(a, i)
> >>>> }
> >>>>
> >>>> The result is not what I wanted. So how to create the bind lists
> >>>> recursively so that the last element would be the newly added one
> >>>> while the previous elements all remain the same?
> >>>
> >>> a <- list()
> >>> for(i in 1:10000) a[[i]] <- i
> >>>
> >>> (The word ``bind'' is inappropriate.)
> >>>
> >>> cheers,
> >>>
> >>> Rolf Turner
> >>
> >> --
> >> Brian D. Ripley, ripley at stats.ox.ac.uk
> >> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> >> University of Oxford, Tel: +44 1865 272861 (self)
> >> 1 South Parks Road, +44 1865 272866 (PA)
> >> Oxford OX1 3TG, UK Fax: +44 1865 272595
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
>
> --
> Brian D. Ripley, ripley at stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list