[R] efficient ways to dynamically grow a dataframe

jim holtman jholtman at gmail.com
Thu Dec 1 15:02:23 CET 2011


First, data frames can be much slower than matrices, for example if
you are changing or accessing values a lot.  I would suggest that you
use a matrix, since it seems that all your values are numeric.
Allocate a large empty matrix to start (ideally as large as you will
need).  If you exceed this, you have the option of 'rbind'ing more
empty rows on and continuing.  Whether that is practical depends on
how large your final matrix might be (you did not state the boundary
conditions).
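
A rough sketch of what I mean, not a drop-in solution.  The sizes, the
names 'sim', 'n_periods' and 'grow', and the random-walk update for x
are all made up for illustration, since the actual rule for x was not
given.  Note that a matrix can carry column names through 'dimnames',
so columns can still be addressed by name:

```r
# Hypothetical sizes; the original post does not state N or T.
N <- 1000L         # number of individuals
n_periods <- 10L   # number of periods (avoids masking T, the alias for TRUE)

# Preallocate one numeric matrix with named columns.
sim <- matrix(NA_real_, nrow = N * n_periods, ncol = 3,
              dimnames = list(NULL, c("id", "t", "x")))
sim[, "id"] <- rep(seq_len(N), times = n_periods)
sim[, "t"]  <- rep(seq_len(n_periods), each = N)

# Placeholder update rule (a random walk), purely an assumption.
set.seed(1)
x_prev <- rnorm(N)
for (p in seq_len(n_periods)) {
  rows <- ((p - 1L) * N + 1L):(p * N)   # rows belonging to period p
  x_prev <- x_prev + rnorm(N)
  sim[rows, "x"] <- x_prev              # fill a whole period at once
}

# If the allocation turns out to be too small, append a block of
# empty rows in one go rather than one row at a time.
grow <- function(m, extra) {
  rbind(m, matrix(NA_real_, nrow = extra, ncol = ncol(m),
                  dimnames = list(NULL, colnames(m))))
}
sim <- grow(sim, N)

# Convert once at the end if a data frame is needed downstream.
df <- as.data.frame(sim)
```

Filling a whole period by row index avoids both the repeated 'rbind'
of single rows and the logical scan over every row that
df1[df1$id==i & df1$t==t, "x"] performs on each iteration.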

On Thu, Dec 1, 2011 at 6:34 AM, Matteo Richiardi
<matteo.richiardi at unito.it> wrote:
> Hi,
> I'm trying to write a small microsimulation in R: that is, I have a
> dataframe with info on N individuals for the base-year and I have to
> grow it dynamically for T periods:
>
> df = data.frame(
>  id = 1:N,
>  x =....
> )
>
> The most straightforward way to solve the problem that came to my mind
> is to create for every period a new dataframe:
>
> for(t in 1:T){
>  for(i in 1:N){
>  row = data.frame(
>   id = i,
>   t = t,
>   x = ...
>   )
>   df = rbind(df,row)
>  }
> }
>
> This is very inefficient, and my PC gets stuck as soon as N is
> raised above a few thousand.
> As an alternative, I created an empty dataframe for all the projected
> periods, and then filled it:
>
> df1 = data.frame(
>  id = rep(1:N,T),
>  t = rep(1:T, each = N),
>  x = rep(NA,N*T)
> )
>
> for(t in 1:T){
>  for(i in 1:N){
>  x = ...
>  df1[df1$id==i & df1$t==t,"x"] = x
>  }
> }
> df = rbind(df,df1)
>
> This is also too slow, and my PC gets stuck. I don't want to go for
> a matrix, because I'd lose the column names and everything would
> become too error-prone.
> Any suggestions on how to do it?
> Thanks in advance,
> Matteo
>
>
>
>
> --
> Matteo Richiardi
> University of Turin
> Faculty of Law
> Department of Economics "Cognetti De Martiis"
> via Po 53, 10124 Torino
> Email: matteo.richiardi at unito.it
> Tel. +39 011 670 3870
> Web page: http://www.personalweb.unito.it/matteo.richiardi/



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


