[R] gc() vs memory.profile()

Hadley Wickham h.wickham at gmail.com
Fri Dec 27 23:47:59 CET 2013


Hi Ross,

It's not obvious how useful memory.profile() is here. I created the
following little experiment to help me understand what
memory.profile() is showing (and to make it easier to see the
changes), but it's left me more confused than enlightened:

m_delta <- function(expr) {
  # Evaluate in clean environment to limit effects
  e <- new.env(parent = parent.frame())
  # Force gc to flush any values no longer attached to names
  gc()
  old <- memory.profile()

  eval(substitute(expr), envir = e)

  gc()
  new <- memory.profile()

  report <- cbind(old, new, delta = new - old)
  # Only show rows where something changed
  report[report[, "delta"] != 0, , drop = FALSE]
}

# Why does this create 3 pairlists, 1 integer and 1 character,
# but no doubles?
m_delta(x <- 1.5)

# No different
m_delta({x <- 1.5})
# Only creates an extra pairlist compared to the previous case
m_delta({x <- 1.5; y <- 2.5})

# Creates 2 pairlists, 1 integer and 1 character even though the code
# should have no lasting impact
m_delta(1)
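
One possible reading of the original discrepancy, offered as a sketch
rather than a diagnosis: memory.profile() counts objects by type, while
the Vcells row of gc() measures the vector heap in 8-byte cells. A
single large vector should therefore move Vcells a lot and the "double"
count hardly at all. The helper name gc_vcells and the vector size are
made up for this illustration:

```r
# Hypothetical helper: force a collection and return the Vcells in use.
gc_vcells <- function() gc()["Vcells", "used"]

before_v <- gc_vcells()
before_p <- memory.profile()

big <- numeric(1e6)  # one double vector holding 1e6 eight-byte values

after_v <- gc_vcells()
after_p <- memory.profile()

# Change in vector heap, in Vcells: expected to be on the order of 1e6
vcells_delta <- after_v - before_v
# Change in the number of double objects: expected to be very small
double_delta <- (after_p - before_p)[["double"]]

print(c(vcells = vcells_delta, doubles = double_delta))
```

If that holds, growth inside a few existing large vectors would show up
in gc() but be nearly invisible in memory.profile().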

For your original case, you may find it more useful to do memory +
line profiling (e.g. as visualised by
https://github.com/hadley/lineprof) to figure out what's going on.
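
As a rough illustration of combined time and memory profiling, here is
a minimal sketch using base R's Rprof() and summaryRprof() rather than
lineprof itself. The function work() is a hypothetical stand-in for
addResults, and the sketch assumes an R build where Rprof() memory
profiling is available:

```r
# Hypothetical stand-in for the function under investigation: it grows
# a vector by repeated copying, which is deliberately wasteful.
work <- function(n) {
  out <- numeric(0)
  for (i in seq_len(n)) out <- c(out, i)
  out
}

prof_file <- tempfile(fileext = ".out")

# Sample the call stack and memory use while the code runs
Rprof(prof_file, memory.profiling = TRUE)
res <- work(2e4)
Rprof(NULL)  # stop profiling

# Summarise by function; guard against runs too short to record samples
prof <- tryCatch(
  summaryRprof(prof_file, memory = "both"),
  error = function(e) NULL
)
if (!is.null(prof)) print(head(prof$by.self))
```

Functions near the top of by.self with large memory totals are the
first places to look.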

Hadley

On Fri, Dec 27, 2013 at 1:49 PM, Ross Boylan <ross at biostat.ucsf.edu> wrote:
> I am trying to understand why a function causes my memory use to
> explode.  While doing that I noticed that my memory use as reported by
> gc() is growing, but the results of memory.profile() are almost
> unchanged (the count for raw grew by 3).  How can the two functions
> produce different results, and what does it mean?
>
>  >   system.time(r <- Reduce(addResults, bigr[1:10]))
>     user  system elapsed
>    3.437   0.000   3.444
>  >   gc()
>              used   (Mb) gc trigger    (Mb)   max used    (Mb)
>  Ncells   2994756  160.0    4418719   236.0    3587436   191.6
>  Vcells 797226672 6082.4 2470056017 18845.1 7340162895 56001.0
>  > memory.profile()
>         NULL      symbol    pairlist     closure environment     promise
>            1       13324     1193588       34602        3000        9754
>     language     special     builtin        char     logical     integer
>       305689          44         637       52237       53954      141713
>       double     complex   character         ...         any        list
>       548076          38      426187          39           0      116692
>   expression    bytecode externalptr     weakref         raw          S4
>            1       65604        4195         708         735       23935
>  >   system.time(r <- Reduce(addResults, bigr[1:20]))
>     user  system elapsed
>    8.432   0.164  10.315   # suspiciously higher than 2* time for 10
>  >   gc()
>              used   (Mb) gc trigger    (Mb)   max used    (Mb)
>  Ncells   2994759  160.0    4418719   236.0    3587436   191.6
> !Vcells 828653314 6322.2 2470056017 18845.1 7340162895 56001.0
>  > memory.profile()
>         NULL      symbol    pairlist     closure environment     promise
>            1       13324     1193588       34602        3000        9754
>     language     special     builtin        char     logical     integer
>       305689          44         637       52237       53954      141713
>       double     complex   character         ...         any        list
>       548076          38      426187          39           0      116692
>   expression    bytecode externalptr     weakref         raw          S4
>            1       65604        4195         708         738       23935
> # by eye, the only change is raw, from 735 to 738.
>
> R 3.0.1 running on Debian GNU/Linux squeeze.
>
> Ross Boylan
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
http://had.co.nz/


