[R] rbind and data.frame

james.holtman@convergys.com james.holtman at convergys.com
Fri Dec 7 20:14:58 CET 2001


Here are some timings from a 700 MHz laptop running Windows 2000:

> x.1 <- data.frame(a=integer(85000), b=double(85000), c=character(85000))
> str(x.1)
`data.frame':   85000 obs. of  3 variables:
 $ a: int  0 0 0 0 0 0 0 0 0 0 ...
 $ b: num  0 0 0 0 0 0 0 0 0 0 ...
 $ c: Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
#
# loading up a variable with a vector takes very little time
#
> system.time(x.1$a <- 1:85000)
[1] 0.03 0.00 0.03   NA   NA
> str(x.1)
`data.frame':   85000 obs. of  3 variables:
 $ a: int  1 2 3 4 5 6 7 8 9 10 ...
 $ b: num  0 0 0 0 0 0 0 0 0 0 ...
 $ c: Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
#
# a 'for' loop by itself is only 0.3 seconds
#
> system.time(for (i in 1:85000)invisible(1))
[1] 0.30 0.00 0.31   NA   NA
#
# it takes me 5 seconds to initialize 85,000 of a variable, so I would
# assume it would depend on how many and what type.  If 'factors', I would
# assume you would declare those as 'character' and then convert to
# 'factor' at the end.
# so it seems fast; is there something I am missing?
#
> system.time(for (i in 1:85000) x.1$a[i] <- i)
[1] 5.12 0.04 5.22   NA   NA
>
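
A minimal sketch of the approach suggested in the comments above: fill plain
vectors inside the loop, build the data frame once at the end, and convert the
character column to a factor only as the last step. The names n, a, b, c.chr
and x.2 are illustrative, and n is kept small so the sketch runs quickly;
actual timings will depend on the machine and the R version.

```r
n <- 1000                      # illustrative size; the timings above used 85000

# Preallocate plain vectors of the final length.
a     <- integer(n)
b     <- double(n)
c.chr <- character(n)

# Fill the vectors element by element; assigning into a plain vector
# avoids the data-frame replacement method on every iteration.
for (i in seq_len(n)) {
  a[i]     <- i
  b[i]     <- sqrt(i)
  c.chr[i] <- if (i %% 2 == 0) "even" else "odd"
}

# Assemble the data frame once, converting to factor only at the end.
x.2 <- data.frame(a = a, b = b, c = factor(c.chr))
str(x.2)
```

Wrapping either loop in system.time(), as in the transcript above, gives a
direct comparison with the x.1$a[i] <- i version.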




"Liaw, Andy" <andy_liaw at merck.com>@stat.math.ethz.ch on 12/07/2001 10:32:31

Sent by:  owner-r-help at stat.math.ethz.ch


To:   r-help at stat.math.ethz.ch
cc:
Subject:  RE: [R] rbind and data.frame


Are you sure that the time difference is *only* in creating the data frame,
rather than other computations in the loop?
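
One rough way to check this is to time the loop body with and without the
data-frame assignment. In the sketch below, work() is a hypothetical stand-in
for whatever the real loop computes, and n is an illustrative size:

```r
n <- 2000                                  # illustrative size

# Hypothetical per-row computation standing in for the real loop body.
work <- function(i) sum(sqrt(seq_len(10))) + i

df <- data.frame(a = double(n))

# Time the computation alone, discarding the results.
t.compute <- system.time(for (i in seq_len(n)) work(i))

# Time the same computation plus assignment into the data frame.
t.both <- system.time(for (i in seq_len(n)) df$a[i] <- work(i))

# Elapsed seconds for each; the difference approximates the assignment cost.
print(c(compute = t.compute[["elapsed"]], both = t.both[["elapsed"]]))
```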

Andy

> -----Original Message-----
> From: Göran Broström [mailto:gb at stat.umu.se]
> Sent: Friday, December 07, 2001 7:25 AM
> To: Prof Brian Ripley
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] rbind and data.frame
>
>
> On Fri, 7 Dec 2001, Prof Brian Ripley wrote:
>
> > On Fri, 7 Dec 2001, Göran Broström wrote:
> >
> > > On Wed, 5 Dec 2001, Göran Broström wrote:
> > >
> > > [...]
> > >
> > > > My real problem is how to create a data frame in a sequentially
> > > > growing manner, when I know the final size (no. of cases). I want
> > > > to avoid calling 'rbind' many times, and instead create an 'empty'
> > > > data frame in one call, and then fill it. Are there better ways of
> > > > doing this?
> > >
> > > Got no answer to this one, so I provide one myself:
> >
> > The usual answer is to create a data frame of the desired size and
> > populate it via indexing.  That's in some books I know!
>
> I know that book too (thanks!). I did what you suggest, and that took 7
> hours to run. Definitely.
>
> Göran
>
> > >
> > > The answer is: Yes, definitely. I did this, with pure R code, and
> > > created a new data frame with around 58000 records. It took 7 hours
> > > to run. I then did it with compiled code (Fortran), and that made a
> > > slight difference:  It took 4.8 seconds(!).
> > >
> > > Göran
> > >
> > >
> > > -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> > > r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> > > Send "info", "help", or "[un]subscribe"
> > > (in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
> > > _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
> > >
> >
> >
>
> --
>  Göran Broström                      tel: +46 90 786 5223
>  professor                           fax: +46 90 786 6614
>  Department of Statistics            http://www.stat.umu.se/egna/gb/
>  Umeå University
>  SE-90187 Umeå, Sweden             e-mail: gb at stat.umu.se
>
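
For the question quoted above, a small sketch contrasting repeated 'rbind'
with a single preallocated data frame that is filled by whole-column
assignment. The size n and the column names id and value are illustrative
and not taken from the original problem:

```r
n <- 500                                   # illustrative size

# Growing with rbind: each call copies the rows accumulated so far.
grown <- data.frame(id = integer(0), value = double(0))
for (i in seq_len(n)) {
  grown <- rbind(grown, data.frame(id = i, value = i / 2))
}

# Preallocating: create the full-size frame once, then fill whole
# columns with vectorized assignments instead of one cell at a time.
filled <- data.frame(id = integer(n), value = double(n))
filled$id    <- seq_len(n)
filled$value <- seq_len(n) / 2

# Both approaches produce the same column contents.
stopifnot(identical(grown$id, filled$id),
          identical(grown$value, filled$value))
```

Whether whole-column assignment is possible depends on the real computation;
when each row depends on earlier rows, filling plain vectors in the loop and
assembling the data frame afterwards is a possible fallback.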








