[R] filehash for big data

michael curran michcurran at yahoo.com
Sun Jan 2 20:14:22 CET 2011


Hi all,

I am trying to use the filehash package to analyze a 5M-by-20 data set with both 
double and string columns.


After consulting a few tutorials online, it seems as though one needs to first 
read the data into R, then create an R object, and then assign that object a 
location on my computer via filehash. It seems like the benefit of this is 
reduced memory use when running subsequent analyses (e.g., descriptive 
statistics, regressions, etc.).
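
To make the workflow concrete, here is a minimal sketch of the steps as I 
understand them from the tutorials. The small data frame, the key name "dat", 
and the temporary database path are placeholders I made up for illustration; 
the real data would be the 5M-row file. I am assuming the default database 
type that dbCreate() uses.

```r
library(filehash)

# Hypothetical small stand-in for the real data (mixed double/string columns)
dat <- data.frame(x = rnorm(1000),
                  g = sample(letters, 1000, replace = TRUE),
                  stringsAsFactors = FALSE)

# Create a database file on disk and open a handle to it
db_path <- tempfile("mydb")   # hypothetical location
dbCreate(db_path)
db <- dbInit(db_path)

# Store the object on disk under a key, then drop the in-memory copy
dbInsert(db, "dat", dat)
rm(dat)

# Later: list the stored keys and fetch the object only when it is needed
dbList(db)
d <- dbFetch(db, "dat")
summary(d$x)
```

As far as I can tell, dbFetch() brings the whole stored object back into 
memory, so this only helps between analyses, not during the initial read.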


My question is: what happens if R chokes when trying to read in the data (i.e., 
step 1)? Is there another package I can use to get the data read in or, 
alternatively, am I misunderstanding the complete functionality of the filehash 
package and what it can do? 


Apologies if this is a basic question--usually I work with considerably smaller 
data frames and don't have much experience with memory issues in R. 


Thanks in advance for any advice/pointers.

Best, Mike


