[Rd] save() and interrupts

Mon Apr 16 21:23:15 CEST 2007

On Mon, 16 Apr 2007, Bill Dunlap wrote:

> On Sun, 15 Apr 2007, Henrik Bengtsson wrote:
>
>> On 4/15/07, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
>>> On Sun, 15 Apr 2007, Henrik Bengtsson wrote:
>>>
>>>> are there any (cross-platform) specs on what the saved file is if
>>>> save() is interrupted, e.g. by a user interrupt?   It could be
>>>> non-existing, empty, partly written, or completed.
>>>
>>> My understanding is that you cannot user interrupt compiled code unless it
>>> is set up to check interrupts.  Version 2 saves are done via the internal
>>> saveToConn, and I don't see any calls to R_CheckUserInterrupt there. So
>>> you only need to worry about user interrupts in the R code, and that has
>>> an on.exit action to close the connection (which should be executed even
>>> if you interrupt).  Which suggests that the file will be
>>>
>>> non-existent
>>> empty
>>> complete
>>>
>>> and the first two depend on interrupting in the millisecond or less before
>>> the compiled code gets called.
>>
>> I'll put it on my todo list to investigate how to make save() more
>> robust against interrupts before calling the internal code.  One
>> option is to use tryCatch().  However, that does not handle too
>> frequent user interrupts, e.g. if an interrupt is sent while in the
>> "interrupt" call, that will interrupt the function.  So, tryCatch()
>> alone will only lower the risk for incomplete or empty files.  For data
>> written to files, one alternative is to check for files of zero size
>> in the on.exit() statement and remove such.
>>
>> /Henrik
>>>
>>> For other forms of interrupts, e.g. a Unix kill -9, the file state could
>>> be anything.
>>>
>>> Brian D. Ripley,                  ripley at stats.ox.ac.uk
>>> ...
>
> You could change the code to write to a temporary
> file (in the directory you want the result in) and
> when you successfully finish writing to the file
> you rename it to the permanent name.  (On an interrupt
> you remove the temp file, and on 'kill -9' the only
> bad effect is the space used by the partially written
> temp file.)  This has the added advantage that you don't
> overwrite an existing save file by the given name until
> you know a suitable replacement is ready.
>
> Perhaps we need a connection type that encapsulates this.
>
> ----------------------------------------------------------------------------
> Bill Dunlap
> Insightful Corporation
> bill at insightful dot com
> 360-428-8146
>
> "All statements in this message represent the opinions of the author and do
> not necessarily reflect Insightful Corporation policy or position."

We do this with save.image.  Since save is a little more general it is
a bit less obvious what the right way to do this sort of thing is, or
whether there is a single right way.  I think if I was concerned about
this I would write something around the current save for particular
kinds of connections rather than changing save itself.  The main
reason for taking a different route with save.image is that that gets
called implicitly by q().
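
A minimal sketch of such a wrapper for the plain file case, following
the write-to-temp-then-rename idea quoted above.  The name safe_save
and its argument list are hypothetical, not anything in base R, and
what file.rename() does when the target already exists is OS-dependent,
so treat this as an illustration rather than a tested replacement:

```r
# Hypothetical helper, not part of base R: save to a temporary file in
# the target directory and rename it to the final name only on success.
safe_save <- function(list, file, envir = parent.frame()) {
  # Same directory as the target, so the rename stays on one file system.
  tmp <- tempfile(pattern = paste0(basename(file), "."),
                  tmpdir = dirname(file))
  # Runs on normal exit, error, or user interrupt.  After a successful
  # rename tmp no longer exists and unlink() has nothing to remove.
  on.exit(unlink(tmp), add = TRUE)
  save(list = list, file = tmp, envir = envir)
  if (!file.rename(tmp, file))
    stop("could not rename ", tmp, " to ", file)
  invisible(file)
}
```

Called as safe_save("x", file = "x.RData"), an interrupt before the
rename should leave any existing x.RData untouched.  A 'kill -9' can
still leave the partially written temporary file behind.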

[our current ability to manage user interrupts is not ideal--hopefully
we can make a bit of progress on this soon.]

Best,

luke

-- 
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:      luke at stat.uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu