[R] Freeing up memory in R

Martin Maechler maechler at stat.math.ethz.ch
Tue May 13 10:05:48 CEST 2014


>>>>> Zilefac Elvis <zilefacelvis at yahoo.com>
>>>>>     on Mon, 12 May 2014 15:01:49 -0700 writes:

    > Hi,
    > I will like to free up memory in R and make my program execute faster. At the moment, MY PROGRAM IS VERY SLOW probably due to memory issues. Here is sample data (Rcode is found at the end) from one simulation(I have 1000 such files to process): 



    > list(c("1971 1 1GGG1  0.00 -3.68 -0.29", "1971 1 1GGG2  0.00 -8.31  0.81", 
    > "1971 1 1GGG3  0.00-10.69  5.69", "1971 1 1GGG4  1.78 -6.96 -2.20", 
    > "1971 1 1GGG5  2.64 -9.48  9.20", "1971 1 1GGG6  0.00 -9.74  3.73", 
    > "1971 1 1GGG7  0.00 -8.49  3.58", "1971 1 1GGG8  0.00 -2.78 -2.92", 
    > "1971 1 1GGG9  0.00 -9.30  0.63", "1971 1 1GG10  4.87 -5.59  3.11", 
    > "1971 1 1GG11  0.10-12.04 10.80", "1971 1 1GG12  0.00 -5.24 -0.43", 
    > "1971 1 1GG13  0.00 -8.82  2.88", "1971 1 1GG14  0.00-11.10 14.50", 
    > "1971 1 1GG15  0.00 -5.54 10.12", "1971 1 1GG16  0.00 -4.54 10.48", 
    > "1971 1 1GG17  0.00  1.68 17.28", "1971 1 1GG18  0.00 -5.79  6.64", 
    > "1971 1 1GG19  0.00 -5.27 14.29", "1971 1 1GG20  0.00 -8.93  9.60", 
    > "1971 1 1GG21  5.29  1.30 15.62", "1971 1 1GG22  0.00 -2.50 19.20", 
    > "1971 1 1GG23  0.00 -7.04 15.73", "1971 1 1GG24  0.00 -8.53 11.60", 
    > "1971 1 1GG25  0.00 -0.82 10.33", "1971 1 1GG26  0.00 -6.28 21.58", 
[.............]
    > ))



    > Here is the code for processing a thousand files of this kind:
    > #===================================================================================================================
    > lst1Sub <- data_above

    > lst2 <- lapply(lst1Sub, function(x) {
    >     dateSite <- gsub("(.*G.{3}).*", "\\1", x)
    >     dat1 <- data.frame(Year  = as.numeric(substr(dateSite, 1, 4)),
    >                        Month = as.numeric(substr(dateSite, 5, 6)),
    >                        Day   = as.numeric(substr(dateSite, 7, 8)),
    >                        Site  = substr(dateSite, 9, 12),
    >                        stringsAsFactors = FALSE)
    >     ## str_trim() is from the stringr package (needs library(stringr))
    >     Sims <- str_trim(gsub(".*G.{3}\\s?(.*)", "\\1", x))
    >     ## insert a space before a negative value glued to the previous field
    >     Sims[grep("\\d+-", Sims)] <-
    >         gsub("(.*)([- ][0-9]+\\.[0-9]+)", "\\1 \\2",
    >              gsub("^([0-9]+\\.[0-9]+)(.*)", "\\1 \\2",
    >                   Sims[grep("\\d+-", Sims)]))
    >     Sims1 <- read.table(text = Sims, header = FALSE)
    >     names(Sims1) <- c("Precipitation", "Tmin", "Tmax")
    >     dat2 <- cbind(dat1, Sims1)
    > })
    > #=========================================================================================================================

    > 1) How can I free up memory in this code? I am working on 1000 files, so speed matters.
    > 2) Is there a faster way of doing the same task as above? My data files are simulated in FORTRAN and are read as:

    > data.frame(Day    = as.numeric(substr(rain.data, 7, 8)),
    >            Month  = as.numeric(substr(rain.data, 5, 6)),
    >            Year   = as.numeric(substr(rain.data, 1, 4)),
    >            Site   = substr(rain.data, 9, 12),
    >            Precip = as.numeric(substr(rain.data, 13, 18)),
    >            Tmin   = as.numeric(substr(rain.data, 19, 24)),
    >            Tmax   = as.numeric(substr(rain.data, 25, 30)))
    > #### Day occupies positions 7 and 8, Month positions 5 and 6, Year positions 1 to 4,
    > #### and so on for Site, Precip, Tmin and Tmax
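[Editor's note: on the memory question, a minimal sketch of a file-at-a-time loop. The directory name `sims` and file pattern are hypothetical; the point is that parsing each file and keeping only the small result before touching the next file means only one file's raw text is ever in memory, and `rm()` plus `gc()` release it promptly.]

```r
## Hypothetical directory and naming scheme for the 1000 simulation files
files <- list.files("sims", pattern = "^sim.*\\.txt$", full.names = TRUE)
results <- vector("list", length(files))

for (i in seq_along(files)) {
    x <- readLines(files[i])     # raw text for ONE file only
    ## ... parse x into a data frame here ...
    results[[i]] <- length(x)    # keep only the (small) parsed result
    rm(x)                        # drop the large raw object explicitly
}
gc()                             # optional: prompt R to release freed memory
```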

Given that your data file has fields "at position <n>",
I think you should use

read.fwf()  instead of  read.table()

and then you might not need all the string manipulation you do
above.
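[Editor's note: for the sample lines above, a read.fwf() call might look like the following sketch. The column positions are taken directly from the poster's description (Year 1-4, Month 5-6, Day 7-8, Site 9-12, Precip 13-18, Tmin 19-24, Tmax 25-30); note that fixed-width splitting handles the "0.10-12.04" case where a negative value is glued to the previous field, with no regex surgery needed.]

```r
## Two of the sample lines, including one with a glued negative value
lines <- c("1971 1 1GGG1  0.00 -3.68 -0.29",
           "1971 1 1GG11  0.10-12.04 10.80")

dat <- read.fwf(textConnection(lines),
                widths     = c(4, 2, 2, 4, 6, 6, 6),
                col.names  = c("Year", "Month", "Day", "Site",
                               "Precip", "Tmin", "Tmax"),
                colClasses = c(rep("integer", 3), "character",
                               rep("numeric", 3)),
                strip.white = TRUE)
```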

Martin


    > Thanks for your great help.
    > Zilefac.



More information about the R-help mailing list