[R] handling a lot of data

Petr Kurtin kurtin at avast.com
Mon Jan 30 09:54:07 CET 2012


Hi,

I have a lot of SPSS data covering the years 1993-2010. I load all of it into
lists so I can easily index the values by year. Unfortunately, the loaded
data occupy quite a lot of memory (10 GB) - so my question is: what is the
best approach to working with big data files? Can R read a value from a data
file without loading the whole file into memory? How can a slower computer
without enough memory work with such data?

I use the following commands:

library(foreign)  # read.spss() comes from the foreign package

data1993 <- vector("list", 4)
data1993[[1]] <- read.spss(...)  # first quarter
data1993[[2]] <- read.spss(...)  # second quarter
...
data_all <- vector("list", 18)   # one slot per year, 1993-2010
names(data_all) <- 1993:2010
data_all[["1993"]] <- data1993
...

and indexing, e.g.: data_all[["1993"]][[1]]$DISTRICT, etc. (The year is
used as a name here; indexing with the bare number, data_all[[1993]],
would silently grow the list to 1993 slots.)
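
One pattern the question hints at - fetching a value without keeping all
years resident in memory - can be sketched as follows. This is only a
sketch, not the poster's code: the file layout and the helper name
get_quarter() are hypothetical.

```r
# Hypothetical helper: read a single quarter's SPSS file on demand,
# instead of holding all of 1993-2010 in one nested list.
get_quarter <- function(year, quarter) {
  path <- sprintf("data/%d_q%d.sav", year, quarter)  # assumed file layout
  foreign::read.spss(path, to.data.frame = TRUE)
}

# Extract only the column that is needed; the rest of the data frame
# becomes eligible for garbage collection once this call returns.
district <- get_quarter(1993, 1)$DISTRICT
```

Only one quarter's file is ever in memory at a time, at the cost of
re-reading the file on each access.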

Thanks,
Petr Kurtin



More information about the R-help mailing list