[R] memory problem in exporting data frame

Henrik Bengtsson hb at maths.lth.se
Tue Sep 9 09:45:34 CEST 2003


Hi, I replied to a related question yesterday (Mon, Sep 8, 2003) with
subject "RE: [R] cannot allocate vector of size...". That was also
about running low on memory, but about *reading* data from a file
rather than writing. However, the problem is likely due to the same
thing.

You pass a large object to a function via an argument, and that
argument is then changed inside the function (in your case
write.table() is doing x <- as.matrix(x)). As long as the argument is
only read, R is clever enough not to create a copy of it, but as soon
as you change it, R creates a local copy (copy on modification).
Hence, you now have your original 'xxx' object plus a local copy
"inside" the function. This is likely to be your problem.

You can use the workaround that Patrick Burns suggested and improve on
it slightly: if you do not need 'xxx' as a data frame anymore, do
'xxx <- as.matrix(xxx)' yourself before the call, so that
write.table() does not have to make the copy. A better approach, as
you suggest yourself, but block by block rather than row by row, is to
write your data frame in blocks of a reasonable size, as sketched
below. This can of course be done using a for loop and write.table(),
but you will do better if you look at the code of write.table() and
avoid repeating the same overhead work in each step.
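
For instance, a minimal sketch of the block-by-block approach (the
block size of 1000 rows and the file name are placeholders; adjust to
your needs):

blockSize <- 1000
nRows <- nrow(xxx)
firstRows <- seq(from=1, to=nRows, by=blockSize)
for (first in firstRows) {
  rows <- first:min(first + blockSize - 1, nRows)
  write.table(xxx[rows, , drop=FALSE], "C:\\xxx", sep="\t",
    row.names=FALSE, col.names=FALSE, quote=FALSE,
    append=(first > 1))   # append all blocks after the first
}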

Finally, and FYI, you might be able to shrink your original data frame
by considering the following:

i <- as.integer(1:1000)
d <- as.double(i)
df1 <- data.frame(i=i, d=d)
df2 <- data.frame(i=i, d=i)
object.size(df1)  # 24392 bytes
object.size(df2)  # 20392 bytes

However, note that x <- as.matrix(x) (which write.table() does) will
coerce the data into *one* storage type (because it is a matrix). In
other words, the only thing you will gain is a smaller 'xxx' object.
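
For example, continuing with 'df1' and 'df2' from above:

storage.mode(as.matrix(df1))  # "double"; the integer column is coerced
storage.mode(as.matrix(df2))  # "integer"; all columns already integer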

Best wishes

Henrik Bengtsson
Lund University

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch 
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Thomas 
> W Blackwell
> Sent: 9 September 2003 01:28
> To: array chip
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] memory problem in exporting data frame
> 
> Simplest is to save your workspace using  save.image(),
> then delete a bunch of large objects other than the data
> frame that you want to export, and run  write.table()
> again, now that you've made space for it.  A quick calc
> shows  17000 x 400 x 8 = 55 Mb, and that's just the size
> of the object that chokes R below.
> 
> -  tom blackwell  -  u michigan medical school  -  ann arbor  -
> 
> On Mon, 8 Sep 2003, array chip wrote:
> 
> > I am having trouble of exporting a large data frame
> > out of R to be used in other purpose. The data frame
> > is numeric with size 17000x400. It takes quite some
> > time to start R as well. My computer has 1GB RAM. I
> > used the following command to write the data frame to
> > a text file and got the error message below:
> >
> > > write.table(xxx, "C:\\xxx", sep="\t",
> > row.names=FALSE,col.names=FALSE,quote=FALSE)
> >
> > Error: cannot allocate vector of size 55750 Kb
> > In addition: Warning message:
> > Reached total allocation of 1023Mb: see
> > help(memory.size)
> >
> > I tried to increase the memory size by
> > memory.size(size=), but it seems running the above
> > command takes forever.
> >
> > what can I do with this error message to get the data
> > out?
> >
> > Thanks
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list 
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> 



