[R] Memory Problems with a Simple Bootstrap

Tom La Bone booboo at gforcecable.com
Fri Aug 1 21:36:42 CEST 2008


Here is the output with gc() wrapped in a print statement.

Tom

> 
> library(boot)
> setwd("C:/Documents and Settings/Tom/Desktop")   
> 
> data.in <- read.csv("inputdata.csv",header=T,as.is=T)
> 
> per95 <- function( annual.data, b.index) {
+   sample.data <- annual.data[b.index,]
+   return(quantile(sample.data$Result,probs=c(0.95))) }
> 
> m <- 10000
> for (i in 1:39) {
+   annual.data <- data.in[data.in$Year == (i+1949),]
+   B <- boot(data=annual.data,statistic=per95,R=m)
+   print(i)  
+   print(gc())  
+   print(object.size(B))
+   print(memory.size())
+ }
[1] 1
         used (Mb) gc trigger (Mb) max used (Mb)
Ncells 145517  3.9     350000  9.4   350000  9.4
Vcells 304805  2.4    2602013 19.9  2841664 21.7
[1] 90352
[1] 12.35812
[1] 2
         used (Mb) gc trigger (Mb) max used  (Mb)
Ncells 145540  3.9     350000  9.4   350000   9.4
Vcells 309041  2.4   12977679 99.1 15259760 116.5
[1] 111032
[1] 12.39814
[1] 3
         used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 145540  3.9     350000   9.4   350000   9.4
Vcells 318147  2.5   35277418 269.2 41833896 319.2
[1] 155544
[1] 12.49432
[1] 4
         used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 145540  3.9     350000   9.4   350000   9.4
Vcells 318867  2.5   37046714 282.7 43935337 335.3
[1] 159064
[1] 11.10362
[1] 5
         used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 145540  3.9     350000   9.4   350000   9.4
Vcells 336129  2.6   79348305 605.4 94192718 718.7
[1] 243456
[1] 11.2296
[1] 6
         used (Mb) gc trigger  (Mb)  max used  (Mb)
Ncells 145540  3.9     350000   9.4    350000   9.4
Vcells 343725  2.7   97971904 747.5 116530118 889.1
[1] 280592
[1] 12.75431
[1] 7
         used (Mb) gc trigger  (Mb)  max used  (Mb)
Ncells 145540  3.9     350000   9.4    350000   9.4
Vcells 348189  2.7  108915204 831.0 129493067 988.0
[1] 302416
[1] 11.32924
[1] 8
         used (Mb) gc trigger  (Mb)  max used   (Mb)
Ncells 145540  3.9     350000   9.4    350000    9.4
Vcells 351735  2.7  117607222 897.3 139706454 1065.9
[1] 319752
[1] 12.85929
[1] 9
         used (Mb) gc trigger   (Mb)  max used   (Mb)
Ncells 145540  3.9     350000    9.4    350000    9.4
Vcells 358217  2.8  133510676 1018.7 158462984 1209.0
[1] 351448
[1] 11.40765
Error: cannot allocate vector of size 284.4 Mb
> 
> 
> 
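
One thing the gc() output above suggests: "used" stays near 2.5 Mb after every iteration while "max used" climbs past 1.2 Gb, so the memory appears to be consumed transiently inside the boot() call rather than retained in B. As a rough sketch of a possible workaround (an assumption, not something confirmed in this thread), the 95th percentile could be bootstrapped one replicate at a time in base R, so that each step only allocates a vector as long as one year of data. The function name boot_per95 and the simulated data are hypothetical; only the Result column name comes from inputdata.csv.

```r
# Hypothetical helper: bootstrap the 95th percentile of a numeric vector,
# drawing one resample per iteration instead of calling boot().
boot_per95 <- function(x, R = 10000) {
  n <- length(x)
  vapply(
    seq_len(R),
    function(r) {
      # Resample row positions with replacement, then take the statistic.
      idx <- sample.int(n, size = n, replace = TRUE)
      quantile(x[idx], probs = 0.95, names = FALSE)
    },
    numeric(1)
  )
}

# Simulated stand-in for one year of the Result column (not the real data).
set.seed(1)
result <- round(rnorm(500, mean = 0, sd = 5))

t_star <- boot_per95(result, R = 2000)
length(t_star)                              # one statistic per replicate
quantile(t_star, probs = c(0.025, 0.975))   # percentile interval for the 95th percentile
```

Inside the loop this would be called as boot_per95(annual.data$Result, R = m). It returns only the vector of replicates, not a "boot" object, so functions such as boot.ci() would not apply to the result directly.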




jholtman wrote:
> 
> The objects seem to be of reasonable size, and the memory size also
> seems reasonable.  That is what I usually go by to see if there are
> large objects in my memory.  If Task Manager was showing that R had
> 1.2GB of memory allocated to it, I wonder if there might be a memory
> leak somewhere.
> 
> On Fri, Aug 1, 2008 at 1:36 PM, Tom La Bone <booboo at gforcecable.com>
> wrote:
>>
>> Same problem. The Windows Task Manager indicated that Rgui.exe was using
>> 1,249,722 K of memory when the error occurred. This is R 2.7.1, by the way.
>>
>>> library(boot)
>>> setwd("C:/Documents and Settings/Tom/Desktop")
>>>
>>> data.in <- read.csv("inputdata.csv",header=T,as.is=T)
>>>
>>> per95 <- function( annual.data, b.index) {
>> +   sample.data <- annual.data[b.index,]
>> +   return(quantile(sample.data$Result,probs=c(0.95))) }
>>>
>>> m <- 10000
>>> for (i in 1:39) {
>> +   annual.data <- data.in[data.in$Year == (i+1949),]
>> +   B <- boot(data=annual.data,statistic=per95,R=m)
>> +   gc()
>> +   print(i)
>> +   print(object.size(B))
>> +   print(memory.size())
>> + }
>> [1] 1
>> [1] 90352
>> [1] 12.35335
>> [1] 2
>> [1] 111032
>> [1] 12.39024
>> [1] 3
>> [1] 155544
>> [1] 12.48451
>> [1] 4
>> [1] 159064
>> [1] 11.10526
>> [1] 5
>> [1] 243456
>> [1] 11.23505
>> [1] 6
>> [1] 280592
>> [1] 12.74642
>> [1] 7
>> [1] 302416
>> [1] 11.33087
>> [1] 8
>> [1] 319752
>> [1] 12.84377
>> [1] 9
>> [1] 351448
>> [1] 11.42264
>> Error: cannot allocate vector of size 284.4 Mb
>>>
>>>
>>
>>
>>
>> jholtman wrote:
>>>
>>> Use gc() in the loop to possibly free up any fragmented memory.  You
>>> might also print out the size of B (object.size(B)) since that appears
>>> to be the only variable in your loop that might be growing.
>>>
>>> On Fri, Aug 1, 2008 at 12:09 PM, Tom La Bone <booboo at gforcecable.com>
>>> wrote:
>>>>
>>>>
>>>> I have a data file called inputdata.csv that looks something like this:
>>>>
>>>>         ID   Year  Result  Month    Date
>>>> 1     7174   1954      10      3  540301
>>>> 2     7174   1954       4      3  540322
>>>> 3    20924   1967       4      2  670223
>>>> 4    20924   1967      -7      5  670518
>>>> 5    20924   1967      -3      7  670706
>>>> ...
>>>> 67209 ...
>>>>
>>>> i.e., it goes on for 67209 rows (~2 Mb file). When I run the following
>>>> bootstrap session I get the indicated error:
>>>>
>>>>>
>>>>> library(boot)
>>>>> setwd("C:/Documents and Settings/Tom/Desktop")
>>>>>
>>>>> data.in <- read.csv("inputdata.csv",header=T,as.is=T)
>>>>>
>>>>> per95 <- function( annual.data, b.index) {
>>>> +   sample.data <- annual.data[b.index,]
>>>> +   return(quantile(sample.data$Result,probs=c(0.95))) }
>>>>>
>>>>> m <- 10000
>>>>> for (i in 1:39) {
>>>> +   annual.data <- data.in[data.in$Year == (i+1949),]
>>>> +   B <- boot(data=annual.data,statistic=per95,R=m)
>>>> +   print(i)
>>>> +   print(memory.size())
>>>> + }
>>>> [1] 1
>>>> [1] 20.26163
>>>> [1] 2
>>>> [1] 61.6352
>>>> [1] 3
>>>> [1] 134.4187
>>>> [1] 4
>>>> [1] 149.4704
>>>> [1] 5
>>>> [1] 290.3090
>>>> [1] 6
>>>> [1] 376.7017
>>>> [1] 7
>>>> [1] 435.7683
>>>> [1] 8
>>>> [1] 463.7404
>>>> [1] 9
>>>> [1] 497.7946
>>>> Error: cannot allocate vector of size 568.8 Mb
>>>>>
>>>>
>>>> I am running this on a Windows XP Pro machine with 4 Gb of memory. The
>>>> same problem occurs when the code is executed on the same box running
>>>> Ubuntu 8.04. Does anyone see any obvious reason why this should run out
>>>> of memory? I would be happy to email the data file to anyone who cares
>>>> to try it on their computer.
>>>>
>>>> Tom
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://www.nabble.com/Memory-Problems-with-a-Simple-Bootstrap-tp18777897p18777897.html
>>>> Sent from the R help mailing list archive at Nabble.com.
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>>
>>>
>>> --
>>> Jim Holtman
>>> Cincinnati, OH
>>> +1 513 646 9390
>>>
>>> What is the problem that you are trying to solve?
>>>
>>>
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Memory-Problems-with-a-Simple-Bootstrap-tp18777897p18779433.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
> 
> 
> 
> -- 
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
> 
> What is the problem that you are trying to solve?
> 
> 

-- 
View this message in context: http://www.nabble.com/Memory-Problems-with-a-Simple-Bootstrap-tp18777897p18781306.html
Sent from the R help mailing list archive at Nabble.com.


