[Rd] Recycling memory with a small free list

Nathan Kurz nate at verse.com
Tue Feb 17 21:51:00 CET 2015


I'm trying to improve the performance of the update loop within a
logistic regression, and am struggling against the overhead of memory
allocation and garbage collection.   The main issue I'd like to solve
is with assignments inside of loops like this:

reweight = function(iter, w, Q) {
  for (i in 1:iter) {
    wT = w * Q
  }
}

If the matrix Q is large I can get a significant gain in performance
if I can reuse the same allocation for wT rather than making a new
allocation on each loop iteration.

While I can solve this problem in this case with an Rcpp function that
does the modifications in place, this seems like a case where a more
general solution might work well.   While normally we don't know
whether an allocation has gone out of scope, the left-hand-side of an
assignment is a frequent exception.  Instead of simply letting the LHS
become unreferenced, has the possibility of adding a small free-list
been considered?

That is, before the RHS is executed, the LHS allocation would be added
to a small fixed length list of available space which is checked
before future allocations.   If the same size is requested before the
next garbage collection, the allocation is short-circuited and the
allocation is reused.   This list could be very small, possibly even
only a single entry.  Entries would only be put on the list if they
have no other references.   If a garbage collection is triggered, the
list would be emptied and the contents collected.

While my understanding of R's memory strategy is still limited, it
looks like this would be a reasonably straightforward patch that could
touch only the assignment operator, the vector allocator, and the
collection routine.   Overall memory usage should go down, and in the
cases I'm considering, the performance improvement would be large.
Measurement details are here:
http://stackoverflow.com/questions/28532493/reusing-existing-memory-for-blas-operations-in-r

Are there downsides or difficulties to this approach that I'm missing?

Nathan Kurz
nate at verse.com



More information about the R-devel mailing list