[R] Reading large files
Vadlamani, Satish {FLNA}
SATISH.VADLAMANI at fritolay.com
Fri Feb 5 02:27:53 CET 2010
Folks:
I am trying to read in a large file. Definition of large is:
Number of lines: 333,250
Size: 850 MB
The machine is a dual-core Intel with 4 GB RAM and nothing else running on it. I read the previous threads on read.fwf and did not see any conclusive statements on how to read such files quickly. An example record and the R code are given below. I was hoping to purchase a better machine and do analysis with larger datasets, but these preliminary results do not look good.
Does anyone have any experience with large files (> 1GB) and using them with Revolution-R?
Thanks.
Satish
Example Code
# Widths and names of the 18 key fields at the start of each record
key_vec <- c(1,3,3,4,2,8,8,2,2,3,2,2,1,3,3,3,3,9)
key_names <- c("allgeo","area1","zone","dist","ccust1","whse","bindc","ccust2","account","area2","ccust3","customer","allprod","cat","bu","class","size","bdc")
key_info <- data.frame(key_vec, key_names)
# sas_time is created elsewhere (not shown here); sas_time$week presumably holds the 209 week labels
col_names <- c(key_names, sas_time$week)
# 209 numeric fields, each 12 characters wide
num_buckets <- rep(12, 209)
width_vec <- c(key_vec, num_buckets)
col_classes <- c(rep("factor", 18), rep("numeric", 209))
# Test run on the first 100 records only:
#threewkoutstat <- read.fwf(file="3wkoutstatfcst_file02.dat", widths=width_vec, header=FALSE, colClasses=col_classes, n=100)
# Full read:
threewkoutstat <- read.fwf(file="3wkoutstatfcst_file02.dat", widths=width_vec, header=FALSE, colClasses=col_classes)
names(threewkoutstat) <- col_names
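One alternative that might be worth timing against the read.fwf call above: as far as I know, read.fwf reformats the data into a temporary delimited file and then hands that to read.table, so cutting the fields directly with vectorised substring() calls may avoid some of that overhead. The sketch below is only an illustration under that assumption. The helper read_fixed(), the demo file, and its four-column layout are all made up for the example and are not the real data.

```r
# Hypothetical helper (not from the original code): parse fixed-width
# records with vectorised substring() calls instead of read.fwf().
read_fixed <- function(path, widths, col_names, numeric_cols) {
  lines  <- readLines(path)          # reads the whole file into memory
  ends   <- cumsum(widths)           # last character of each field
  starts <- ends - widths + 1        # first character of each field
  cols <- lapply(seq_along(widths), function(i) {
    field <- substring(lines, starts[i], ends[i])
    # as.numeric() tolerates the leading blanks used for padding
    if (i %in% numeric_cols) as.numeric(field) else factor(field)
  })
  names(cols) <- col_names
  as.data.frame(cols, stringsAsFactors = FALSE)
}

# Tiny made-up demo file: a 1-char key, a 3-char key, two 12-char numbers.
demo_file <- tempfile(fileext = ".dat")
writeLines(c(
  paste0("A", "004", sprintf("%12.2f", 0.6),  sprintf("%12.2f", 0)),
  paste0("B", "017", sprintf("%12.2f", 1.25), sprintf("%12.2f", 3))
), demo_file)

demo <- read_fixed(demo_file,
                   widths       = c(1, 3, 12, 12),
                   col_names    = c("allgeo", "area1", "wk1", "wk2"),
                   numeric_cols = 3:4)
str(demo)
```

For the real file the call would use width_vec, col_names and numeric_cols = 19:227. Since readLines() holds every line in memory at once, reading in chunks through a file connection may be needed on a 4 GB machine; I have not timed either variant on this data.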
Example record (only one record pasted below)
A004001003799000049250000492599990049999A001002002015002015009 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.60 0.60 0.60 0.70 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
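As a sanity check on the sizes quoted at the top, the record layout can be turned into a rough estimate of the file size and of the memory the finished data frame would need. This is a back-of-the-envelope sketch only: it assumes every numeric field really occupies 12 characters (as num_buckets says, although the record as pasted shows single spaces), one newline byte per record, 8 bytes per numeric value and 4 bytes per factor code, and it ignores factor levels and any temporary copies made while parsing.

```r
# Layout taken from the example code; byte sizes are assumptions (see text).
key_widths <- c(1,3,3,4,2,8,8,2,2,3,2,2,1,3,3,3,3,9)
n_numeric  <- 209      # numeric fields per record
n_lines    <- 333250   # records in the file

record_chars  <- sum(key_widths) + n_numeric * 12    # characters per record
file_bytes    <- n_lines * (record_chars + 1)        # +1 for the newline
numeric_bytes <- n_lines * n_numeric * 8             # doubles
factor_bytes  <- n_lines * length(key_widths) * 4    # integer factor codes

cat("Characters per record:", record_chars, "\n")
cat("Estimated file size (MB):", round(file_bytes / 1e6), "\n")
cat("Estimated data frame size (MB):",
    round((numeric_bytes + factor_bytes) / 1e6), "\n")
```

Under these assumptions the file comes out at roughly 857 MB, which agrees with the 850 MB reported, and the parsed data at roughly 580 MB, so the final object should fit in 4 GB of RAM; any trouble is more likely to come from the intermediate copies created during the read.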