[R] Freeing up memory in R
Martin Maechler
maechler at stat.math.ethz.ch
Tue May 13 10:05:48 CEST 2014
>>>>> Zilefac Elvis <zilefacelvis at yahoo.com>
>>>>> on Mon, 12 May 2014 15:01:49 -0700 writes:
> Hi,
> I would like to free up memory in R and make my program execute faster. At the moment, MY PROGRAM IS VERY SLOW, probably due to memory issues. Here is sample data from one simulation (the R code is at the end); I have 1000 such files to process:
> list(c("1971 1 1GGG1 0.00 -3.68 -0.29", "1971 1 1GGG2 0.00 -8.31 0.81",
> "1971 1 1GGG3 0.00-10.69 5.69", "1971 1 1GGG4 1.78 -6.96 -2.20",
> "1971 1 1GGG5 2.64 -9.48 9.20", "1971 1 1GGG6 0.00 -9.74 3.73",
> "1971 1 1GGG7 0.00 -8.49 3.58", "1971 1 1GGG8 0.00 -2.78 -2.92",
> "1971 1 1GGG9 0.00 -9.30 0.63", "1971 1 1GG10 4.87 -5.59 3.11",
> "1971 1 1GG11 0.10-12.04 10.80", "1971 1 1GG12 0.00 -5.24 -0.43",
> "1971 1 1GG13 0.00 -8.82 2.88", "1971 1 1GG14 0.00-11.10 14.50",
> "1971 1 1GG15 0.00 -5.54 10.12", "1971 1 1GG16 0.00 -4.54 10.48",
> "1971 1 1GG17 0.00 1.68 17.28", "1971 1 1GG18 0.00 -5.79 6.64",
> "1971 1 1GG19 0.00 -5.27 14.29", "1971 1 1GG20 0.00 -8.93 9.60",
> "1971 1 1GG21 5.29 1.30 15.62", "1971 1 1GG22 0.00 -2.50 19.20",
> "1971 1 1GG23 0.00 -7.04 15.73", "1971 1 1GG24 0.00 -8.53 11.60",
> "1971 1 1GG25 0.00 -0.82 10.33", "1971 1 1GG26 0.00 -6.28 21.58",
[.............]
> ))
> Here is the code for processing a thousand files of this kind:
> #===================================================================================================================
> lst1Sub <- data_above
> library(stringr)  # provides str_trim()
> lst2 <- lapply(lst1Sub, function(x) {
>   dateSite <- gsub("(.*G.{3}).*", "\\1", x)
>   dat1 <- data.frame(Year = as.numeric(substr(dateSite, 1, 4)),
>                      Month = as.numeric(substr(dateSite, 5, 6)),
>                      Day = as.numeric(substr(dateSite, 7, 8)),
>                      Site = substr(dateSite, 9, 12),
>                      stringsAsFactors = FALSE)
>   Sims <- str_trim(gsub(".*G.{3}\\s?(.*)", "\\1", x))
>   # Insert a space where a negative value runs into the number before it
>   Sims[grep("\\d+-", Sims)] <- gsub("(.*)([- ][0-9]+\\.[0-9]+)", "\\1 \\2",
>     gsub("^([0-9]+\\.[0-9]+)(.*)", "\\1 \\2", Sims[grep("\\d+-", Sims)]))
>   Sims1 <- read.table(text = Sims, header = FALSE)
>   names(Sims1) <- c("Precipitation", "Tmin", "Tmax")
>   dat2 <- cbind(dat1, Sims1)
> })
> #=========================================================================================================================
> 1) Please show how to free up memory with this code; I am working on 1000 files, so I am after speed.
> 2) Is there a faster way of doing the same task as above? My data files are simulated in FORTRAN and read as:
> data.frame(Day=as.numeric(substr(rain.data,7,8)),
> Month=as.numeric(substr(rain.data,5,6)),
> Year=as.numeric(substr(rain.data,1,4)),
> Site=substr(rain.data,9,12),
> Precip=as.numeric(substr(rain.data,13,18)),
> Tmin=as.numeric(substr(rain.data,19,24)),
> Tmax=as.numeric(substr(rain.data,25,30)))
> #### Day occupies positions 7 and 8, Month positions 5 and 6, Year positions 1 to 4,
> #### and so on for Site, Precip, Tmin and Tmax
Given that your data file has things "at position <n>",
I think you should use
read.fwf() instead of read.table()
and then you might not need all the string manipulations you do
above.
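
A minimal sketch of that suggestion, assuming the field widths 4, 2, 2, 4, 6, 6, 6
implied by the substr() positions quoted above. The two sample records are
re-padded by hand to that layout (an assumption, since the quoted sample seems to
have lost some blanks), and the column names are only illustrative:

```r
# Two records in the assumed fixed-width layout:
# Year(4) Month(2) Day(2) Site(4) Precip(6) Tmin(6) Tmax(6)
records <- c("1971 1 1GGG1  0.00 -3.68 -0.29",
             "1971 1 1GGG3  0.00-10.69  5.69")

# Write them to a temporary file to stand in for one simulation file
tmp <- tempfile(fileext = ".txt")
writeLines(records, tmp)

# read.fwf() splits each line by column position, so a Tmin value that
# runs into Precip (as in "0.00-10.69") needs no regular expressions
dat <- read.fwf(tmp,
                widths = c(4, 2, 2, 4, 6, 6, 6),
                col.names = c("Year", "Month", "Day", "Site",
                              "Precip", "Tmin", "Tmax"),
                strip.white = TRUE,
                stringsAsFactors = FALSE)
unlink(tmp)

str(dat)
```

For many files, the same call could be wrapped in lapply() over a vector of
file names.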
Martin
> Thanks for your great help.
> Zilefac.