[R] excessive (?) memory utilization by package foreign when reading SAS xport file
djnordlund at frontier.com
Fri Dec 17 07:15:34 CET 2010
Someone recently asked about reading the BRFSS data into R. The answer was as simple as using the foreign package to read the SAS xport dataset. I have since been asked to assist someone in using R and the survey package to analyze the BRFSS survey. At work, I have a 64-bit system running Windows 7 Enterprise edition with 12 GB of RAM, and I have 64-bit R-2.12.0 installed from CRAN. The xport file is about 830 MB. I executed the following code to read the file:
library(foreign)  # provides read.xport()
brfss09 <- read.xport("C:/Users/djnordlund/Documents/R-examples/BRFSS/cdbrfs09.xpt")
The file was read in and I was able to begin playing with it. I then tried to read the file on my home system, a 64-bit machine running Windows 7 Professional edition with 8 GB of RAM, which also has 64-bit R-2.12.0 installed from CRAN. I used the same syntax as above, and after a couple of minutes I received an error message saying that a 3.3 MB vector could not be allocated and that I had used up all available memory. When I ran gc(), it reported that the maximum amount of memory used was over 7 GB. I then checked my work computer, where the read had succeeded, and it had used 9.5 GB of RAM when reading the BRFSS xport file.
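For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of how the size of the finished object can be set against the peak memory R reports. It uses a small made-up data frame (dat, with hypothetical columns id, weight and state) in place of the BRFSS data, and it assumes the last column of the matrix returned by gc() is the "max used" figure in Mb, as it is on my installs. The peak includes everything else held in the R session, so it is only a rough upper bound on what the read itself needed.

```r
# Reset the "max used" counters so the peak reflects only the steps below.
invisible(gc(reset = TRUE))

# Hypothetical stand-in for the data frame that read.xport() would return.
n <- 1e5
dat <- data.frame(id     = seq_len(n),
                  weight = rnorm(n),
                  state  = sample(letters, n, replace = TRUE))

# Size of the finished object, in MB.
object_mb <- as.numeric(object.size(dat)) / 1024^2

# gc() returns a matrix with one row each for Ncells and Vcells;
# the last column is assumed to be "max used" in Mb.
mem <- gc()
peak_mb <- sum(mem[, ncol(mem)])

cat(sprintf("object: %.1f MB, peak R memory: %.1f MB\n", object_mb, peak_mb))
```

Replacing the stand-in data frame with the actual read.xport() call would give the same two figures for the real file.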
Is it expected that read.xport() requires much more memory to read a file than is needed to hold the resulting data in memory? If necessary, I can install a database as a back-end and read the data in pieces for analysis, but I was surprised by the memory requirement of read.xport(). Or am I doing something wrong?
Bothell, WA USA