[R] gc() vs memory.profile()

Ross Boylan ross at biostat.ucsf.edu
Sat Dec 28 00:05:46 CET 2013


On Fri, 2013-12-27 at 16:47 -0600, Hadley Wickham wrote:
> For your original case, you may find it more useful to do memory +
> line profiling (e.g. as visualised by
> https://github.com/hadley/lineprof) to figure out what's going on.
> 
> Hadley

I've been trying memory and line profiling, but memory="stats" and
memory="tseries" don't seem to work in summaryRprof.  I'm not sure whether
I've misunderstood something or there are bugs:
 >   Rprof(memory.profiling=TRUE, gc.profiling=TRUE, line.profiling=TRUE)
 >   system.time(r <- Reduce(addResults, bigr[1:10]))
    user  system elapsed
   3.408   0.000   3.415
 >   Rprof(NULL)
 > summaryRprof(memory="both")
 $by.self
        self.time self.pct total.time total.pct mem.total
 "<GC>"      4.06      100       4.06       100       3.2

 $by.total
               total.time total.pct mem.total self.time self.pct
 "<GC>"              4.06    100.00       3.2      4.06      100
 "gc"                4.06    100.00       3.2      0.00        0
 "system.time"       4.06    100.00       3.2      0.00        0
 "f"                 3.42     84.24       3.2      0.00        0
 "Reduce"            3.42     84.24       3.2      0.00        0

 $sample.interval
 [1] 0.02

 $sampling.time
 [1] 4.06

 > summaryRprof(memory="tseries")
 Error in r[i1] - r[-length(r):-(length(r) - lag + 1L)] :
   non-numeric argument to binary operator
 In addition: Warning message:
 In data.frame(..., check.names = FALSE) :
   row names were found from a short variable and have been discarded
 > summaryRprof(memory="tseries", index=2)
 Error in r[i1] - r[-length(r):-(length(r) - lag + 1L)] :
   non-numeric argument to binary operator
 In addition: Warning message:
 In data.frame(..., check.names = FALSE) :
   row names were found from a short variable and have been discarded
 > summaryRprof(memory="both", diff=TRUE)  # diff=TRUE doesn't seem to matter
 $by.self
        self.time self.pct total.time total.pct mem.total
 "<GC>"      4.06      100       4.06       100       3.2

 $by.total
               total.time total.pct mem.total self.time self.pct
 "<GC>"              4.06    100.00       3.2      4.06      100
 "gc"                4.06    100.00       3.2      0.00        0
 "system.time"       4.06    100.00       3.2      0.00        0
 "f"                 3.42     84.24       3.2      0.00        0
 "Reduce"            3.42     84.24       3.2      0.00        0

 $sample.interval
 [1] 0.02

 $sampling.time
 [1] 4.06

 > summaryRprof(memory="stats", diff=TRUE)
 Error in tapply(seq_len(1L), list(index = c("\"system.time\":\"gc\"",  :
   arguments must have same length
 > summaryRprof(memory="stats")
 Error in tapply(seq_len(1L), list(index = c("\"system.time\":\"gc\"",  :
   arguments must have same length

Even memory="both", which at least runs, is not illuminating to me.
I guess it's saying that all the time is going to garbage collection. I
may try your tool to see if that helps.
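
To narrow down which summaryRprof modes fail, something like the sketch
below might help.  It is only a sketch: profile_modes and the toy workload
are hypothetical names, not part of R, and I've left line.profiling off to
keep it minimal.  It writes the profile to its own temporary file and then
tries each memory= mode in turn, recording "ok" or the error message
instead of stopping at the first failure.

```r
# Hypothetical helper (not part of R): profile an expression into a
# temporary file, then try every summaryRprof() memory mode and record
# either "ok" or the error message that mode produced.
profile_modes <- function(expr, interval = 0.02) {
  prof_file <- tempfile(fileext = ".out")
  on.exit(unlink(prof_file), add = TRUE)

  Rprof(prof_file, interval = interval,
        memory.profiling = TRUE, gc.profiling = TRUE)
  # expr is a promise, so it is only evaluated here, while profiling is on;
  # profiling is switched off even if the expression fails.
  tryCatch(force(expr), finally = Rprof(NULL))

  modes <- c("none", "both", "tseries", "stats")
  vapply(modes, function(m) {
    tryCatch({
      summaryRprof(prof_file, memory = m)
      "ok"
    }, error = function(e) conditionMessage(e))
  }, character(1))
}

# Toy allocation-heavy workload, standing in for the Reduce() call above.
res <- profile_modes({
  x <- NULL
  for (i in 1:200) x <- c(x, rnorm(5000))
})
print(res)
```

The result is a named character vector with one entry per mode, so the
modes that error can be compared side by side on the same profile file.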

Thanks.
Ross


