[R] Function environments serialize to a lot of data until they don't
Ivan Krylov
|kry|ov @end|ng |rom d|@root@org
Fri Mar 8 13:57:59 CET 2024
Hello R-help,
I've noticed that my 'parallel' jobs take too much memory to store and
transfer to the cluster workers. I've managed to trace it to the
following:
# `payload` is being written to the cluster worker.
# The function FUN had been created as a closure inside my package:
payload$data$args$FUN
# function (l, ...)
# withCallingHandlers(fun(l$x, ...), error = .wraperr(l$name))
# <bytecode: 0x5644a9f08a90>
# <environment: 0x5644aa841ad8>
# The function seems to bring a lot of captured data with it.
e <- environment(payload$data$args$FUN)
length(serialize(e, NULL))
# [1] 738202878
parent.env(e)
# <environment: namespace:mypackage>
# The parent environment has a name, so it all must be right here.
# What is it?
ls(e, all.names = TRUE)
# [1] "fun"
length(serialize(e$fun, NULL))
# [1] 317
# The only object in the environment is small!
# Where is the 700 megabytes of data?
length(serialize(e, NULL))
# [1] 536
length(serialize(payload$data$args$FUN, NULL))
# [1] 1722
Once I've observed `fun`, the environment shrinks dramatically: it now
serializes to 536 bytes instead of 738202878.
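The effect can be reproduced outside the package with a small sketch. The names `make_wrapper`, `caller` and `big` below are hypothetical, and the explanation in the comments assumes that the size comes from an unevaluated promise keeping a reference to the caller's frame:

```r
# Hypothetical closure factory: `fun` stays an unevaluated promise
# until something looks at it.
make_wrapper <- function(fun) function(x) fun(x)

caller <- function() {
  big <- numeric(1e6)      # about 8 MB living in the caller's frame
  make_wrapper(identity)   # the promise for `fun` remembers this frame
}

w <- caller()
e <- environment(w)
size_before <- length(serialize(e, NULL))
invisible(e$fun)           # observing `fun` forces the promise
size_after <- length(serialize(e, NULL))
c(before = size_before, after = size_after)
```

If the assumption holds, `before` should be dominated by `big` and `after` should be far smaller. Forcing would not help if `fun` were itself a closure created inside `caller`, since its own environment would then still contain `big`.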
I managed to work around it by forcing the promise and explicitly
putting `fun` in a small environment when constructing the closure:
.wrapfun <- function(fun) {
  e <- new.env(parent = loadNamespace('mypackage'))
  e$fun <- fun
  # NOTE: a naive return(function(...)) could serialize to 700
  # megabytes due to `fun` seemingly being a promise (?). Once the
  # promise is resolved, suddenly `fun` is much more compact.
  ret <- function(l, ...) withCallingHandlers(
    fun(l$x, ...),
    error = .wraperr(l$name)
  )
  environment(ret) <- e
  ret
}
Is this analysis correct? Could a simple f <- force(fun) have sufficed?
Where can I read more about this type of problem?
If this really is due to promises, what would be the downsides of
forcing them during serialization?
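For comparison, the `force()` variant asked about above can be sketched on a toy case. `wrap_lazy`, `wrap_forced`, `caller` and `big` are hypothetical names, and the sketch only compares serialized sizes; it does not show whether this would be sufficient for the real package:

```r
# Lazy factory: `fun` is captured as an unevaluated promise.
wrap_lazy <- function(fun) function(x) fun(x)

# Forced factory: the promise is evaluated before the closure is returned.
wrap_forced <- function(fun) {
  force(fun)
  function(x) fun(x)
}

caller <- function(wrap) {
  big <- numeric(1e6)   # about 8 MB in the frame the promise refers to
  wrap(identity)
}

size_lazy   <- length(serialize(caller(wrap_lazy), NULL))
size_forced <- length(serialize(caller(wrap_forced), NULL))
c(lazy = size_lazy, forced = size_forced)
```

Unlike `.wrapfun` above, the forced variant keeps the factory's call frame as the closure's environment, so any other arguments or local variables of the factory would still be serialized along with it.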
--
Best regards,
Ivan