[R] checkpointing

Avi Gross @v|gro@@ @end|ng |rom ver|zon@net
Tue Dec 14 01:06:17 CET 2021


I am wondering what it even means to do what you say. In a compiled
language, I can imagine wrapping up an executable along with some kind of
run-time image (which may actually contain the parts of the executable that
have not run yet) and reviving it elsewhere.

But even there, how would it work if, say, the executable kept opening more
files for reading or appending and you moved it to a place where those files
did not exist or had different contents, or other such scenarios? What
happens to open pipes with another process attached, for an OS that supports
pipes? When you restart, the other processes are not there even if you supply
an image of a pipe, and I am sure others can imagine much more.

R is interpreted. You could say the main interpreter may be like an
executable and there may be multiple threads active at the time you stop the
process and bundle it to be restarted later. But R has many fairly dynamic
features, including code the interpreter has not even looked at yet. Besides
files it may want to open, there are any number of statements like
library(package) it may encounter and of course other files it may
source(file). In general, the info on what may be needed later is not in
any serious way bundled with the file, and many things may be hard to predict
even with a look-ahead, as arguments to functions are often not evaluated
until some indefinite later time, or even never.
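
To make the point about unevaluated arguments concrete, here is a minimal
sketch. The function maybe_use() and the file name are made up for
illustration; the only thing relied on is R's standard lazy evaluation of
function arguments.

```r
# Hypothetical helper: 'extra' is only evaluated if 'use_extra' is TRUE.
maybe_use <- function(x, extra, use_extra = FALSE) {
  if (use_extra) x + extra else x
}

# The second argument names a file that does not exist, but the call still
# succeeds because the argument is never evaluated.
never <- maybe_use(1, source("not-yet-written.R"))

# Here the argument is evaluated, and only at the point where it is used.
forced <- maybe_use(1, 2, use_extra = TRUE)
```

A tool looking ahead from outside cannot easily tell which of these two
cases it will get, so it cannot know whether the file has to travel with
the checkpoint.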

I am trying to imagine how you would stop and restore, say, an R program
running connected to something like RStudio, which is also connected to a
Python program, with data and instructions flowing back and forth.

It does not strike me as easy to make a reliable method to do this, albeit
as noted, there are operating systems that do allow you to suspend arbitrary
processes and restart them locally, though perhaps only until the system
reboots.

But I can think of exceptions, including some I see others have thought of.
An example might be an R program that reads in lots of data, then makes
objects like data.frames and then pauses in some kind of nested loop that
will process the data while having the current indices saved in variables.
It could literally ask to be frozen so it starts up from there when asked
to. R can be set to intercept some signals and perhaps voluntarily save all
the variables as they are (including the data it may be operating on and
what it is making from it (as in what search items it has already found) as
well as the needed index info) and exit gracefully. If the application is
restarted, it might note the file with saved info and read in all the data
and continue from there. The above is not a serious proposal and has lots of
things that can go wrong, but I can imagine it as an app that sets itself up
doing heavy lifting once and later every time you want to do a search, it
loads the data and gets from you something to search for and does it quickly
and resuspends till needed. But this example is not exactly what you asked.
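
A rough sketch of that voluntary save-and-resume idea, under stated
assumptions: the function run_search(), the checkpoint file name and the
stop_after argument are all invented here, and stop_after merely stands in
for "being asked to freeze" rather than a real signal. The state is kept
with saveRDS() and readRDS().

```r
# Hypothetical checkpoint file; a real program would pick a stable location.
ckpt_file <- file.path(tempdir(), "search_checkpoint.rds")

run_search <- function(data, ckpt_file, stop_after = Inf) {
  # Resume from saved state if a checkpoint exists, otherwise start fresh.
  if (file.exists(ckpt_file)) {
    state <- readRDS(ckpt_file)
  } else {
    state <- list(i = 0L, hits = integer(0))
  }
  steps <- 0
  while (state$i < length(data)) {
    if (steps >= stop_after) {
      # Stand-in for "asked to freeze": save the state and exit gracefully.
      saveRDS(state, ckpt_file)
      return(invisible(NULL))
    }
    state$i <- state$i + 1L
    # Toy "search": remember the positions of multiples of 7.
    if (data[state$i] %% 7 == 0) state$hits <- c(state$hits, state$i)
    steps <- steps + 1
  }
  unlink(ckpt_file)  # finished, so remove the stale checkpoint
  state$hits
}

data <- 1:50
unlink(ckpt_file)                                        # start clean
first  <- run_search(data, ckpt_file, stop_after = 20)   # "frozen" after 20 items
second <- run_search(data, ckpt_file)                    # resumes at item 21
```

The second call does not redo the first 20 items; it picks up the saved
index and the hits found so far. Anything not placed in the saved state
(open files, connections and so on) is of course not restored.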

I have actually done weird things like the above including things that
simply start up again after a reboot as if nothing happened. 

What is a more interesting question for me is what R features might make
sense that help construct a program that is in some sense re-startable if
used right. I can imagine a package that lets you set things like a "level"
for debugging so that your code when started at some point says:

# initialize.
# load any left-in-file data if it exists.

if (level < 2) {
  # do stuff
  level <- 2
}

if (level < 3) {
  # do more stuff
  level <- 3
}


Something like the above might wrap parts in something like a "try()" that
intercepts some interrupt condition and saves the needed status info.
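
As a sketch only, the level idea combined with catching an interrupt might
look like the following. The names run_job(), status_file and
simulate_interrupt are made up, and the interrupt is faked by signalling a
condition of class "interrupt" so the example can run unattended; in an
interactive session the same handler would be expected to catch a real
Ctrl-C.

```r
# Hypothetical status file holding the last completed level and a log.
status_file <- file.path(tempdir(), "job_status.rds")

run_job <- function(status_file, simulate_interrupt = FALSE) {
  # Load any left-in-file status if it exists; otherwise start at level 1.
  if (file.exists(status_file)) {
    st <- readRDS(status_file)
  } else {
    st <- list(level = 1, done = character(0))
  }
  tryCatch({
    if (st$level < 2) {
      st$done <- c(st$done, "stuff")        # do stuff
      st$level <- 2
    }
    if (simulate_interrupt) {
      # Stand-in for an interrupt arriving between the two stages.
      stop(structure(list(message = "simulated interrupt", call = NULL),
                     class = c("interrupt", "condition")))
    }
    if (st$level < 3) {
      st$done <- c(st$done, "more stuff")   # do more stuff
      st$level <- 3
    }
    unlink(status_file)   # all stages finished, nothing left to resume
  }, interrupt = function(cond) {
    saveRDS(st, status_file)   # save the needed status info
  })
  st
}

unlink(status_file)                                    # start clean
a <- run_job(status_file, simulate_interrupt = TRUE)   # stops after stage 1
b <- run_job(status_file)                              # resumes at stage 2
```

The second run skips the first stage because the saved level is already 2.
This only protects work at stage boundaries; an interrupt in the middle of
a stage means that stage is redone from its start.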

What I wonder is whether long-running processes that can be up for months,
say in a web server, may already have ways to save all kinds of status info
so that when they start up again after a normal reboot, they are able to
continue almost as if nothing had happened.

-----Original Message-----
From: R-help <r-help-bounces using r-project.org> On Behalf Of Jeff Newmiller
Sent: Monday, December 13, 2021 11:54 AM
To: Andy Jacobson <andy.jacobson using noaa.gov>; Andy Jacobson via R-help
<r-help using r-project.org>; r-help using r-project.org
Subject: Re: [R] checkpointing

This sounds like an OS feature, not an R feature... certainly not a portable
R feature.

On December 13, 2021 8:37:30 AM PST, Andy Jacobson via R-help
<r-help using r-project.org> wrote:
>Has anyone ever considered what it would take to implement checkpointing in
>R, so that long-running processes could be interrupted and resumed later,
>from a different process or even a different machine?

Sent from my phone. Please excuse my brevity.
