[Rd] save() and interrupts
Luke Tierney
luke at stat.uiowa.edu
Tue Apr 17 14:48:30 CEST 2007
On Mon, 16 Apr 2007, Henrik Bengtsson wrote:
> On 4/16/07, Luke Tierney <luke at stat.uiowa.edu> wrote:
>> On Mon, 16 Apr 2007, Bill Dunlap wrote:
>>
>> > On Sun, 15 Apr 2007, Henrik Bengtsson wrote:
>> >
>> >> On 4/15/07, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
>> >>> On Sun, 15 Apr 2007, Henrik Bengtsson wrote:
>> >>>
>> >>>> are there any (cross-platform) specs on what the saved file is if
>> >>>> save() is interrupted, e.g. by a user interrupt? It could be
>> >>>> non-existing, empty, partly written, or completed.
>> >>>
>> >>> My understanding is that you cannot user interrupt compiled code unless it
>> >>> is set up to check interrupts. Version 2 saves are done via the internal
>> >>> saveToConn, and I don't see any calls to R_CheckUserInterrupt there. So
>> >>> you only need to worry about user interrupts in the R code, and that has
>> >>> an on.exit action to close the connection (which should be executed even
>> >>> if you interrupt). Which suggests that the file will be
>> >>>
>> >>> non-existent
>> >>> empty
>> >>> complete
>> >>>
>> >>> and the first two depend on interrupting in the millisecond or less before
>> >>> the compiled code gets called.
>> >>
>> >> I'll put it on my todo list to investigate how to make save() more
>> >> robust against interrupts before calling the internal code. One
>> >> option is to use tryCatch(). However, that does not handle too
>> >> frequent user interrupts, e.g. if an interrupt is sent while in the
>> >> "interrupt" call, that will interrupt the function. So, tryCatch()
>> >> alone will only lower the risk for incomplete empty files. For data
>> >> written to files, one alternative is to check for files of zero size
>> >> in the on.exit() statement and remove such.
>> >>
>> >> /Henrik
>> >>>
>> >>> For other forms of interrupts, e.g. a Unix kill -9, the file state could
>> >>> be anything.
>> >>>
>> >>> Brian D. Ripley, ripley at stats.ox.ac.uk
>> >>> ...
>> >
>> > You could change the code to write to a temporary
>> > file (in the directory you want the result in) and
>> > when you successfully finish writing to the file
>> > you rename it to the permanent name. (On an interrupt
>> > you remove the temp file, and on 'kill -9' the only
>> > bad effect is the space used by the partially written
>> > temp file.) This has the added advantage that you don't
>> > overwrite an existing save file by the given name until
>> > you know a suitable replacement is ready.
>> >
>> > Perhaps we need a connection type that encapsulates this.
>> >
>> > ----------------------------------------------------------------------------
>> > Bill Dunlap
>> > Insightful Corporation
>> > bill at insightful dot com
>> > 360-428-8146
>> >
>> > "All statements in this message represent the opinions of the author and do
>> > not necessarily reflect Insightful Corporation policy or position."
>>
>> We do this with save.image. Since save is a little more general it is
>> a bit less obvious what the right way to do this sort of thing is, or
>> whether there is a single right way. I think if I was concerned about
>> this I would write something around the current save for particular
>> kinds of connections rather than changing save itself. The main
>> reason for taking a different route with save.image is that that gets
>> called implicitly by q().
>>
>> [our current ability to manage user interrupts is not ideal--hopefully
>> we can make a bit of progress on this soon.]
>
> I was thinking about this last night: It would be useful to have a
> feature/construct to evaluate an R expression atomically where user
> interrupts will *not have an effect until afterwards*, cf. calls to
> native code. This would solve the problem of getting interrupts while
> in a tryCatch(..., interrupt=..., finally=...). Of course this
> requires caution by the programmer, but it is also unlikely to be used
> by someone who does not know what the risks are. I do not know the
> different signals available, but one could consider such atomic calls
> to be protected against different levels of signals. In addition, one
> could have an optional threshold of the number of interrupt signals it
> takes to (even) interrupt an atomic evaluation.
This is the sort of thing I have been thinking about. One also needs to
enable interrupts within selected parts of such a construct, and these
things need to cooperate with each other and with internal code. There
is a paper on doing these sorts of things in a principled way in
Haskell that I want to spend some time reading to see what translates
to us.
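
For the narrower file-safety question, the write-to-a-temporary-file-then-rename
idea from earlier in the thread can be wrapped around the current save()
without changing save itself. The following is only a rough sketch under
some assumptions: safe_save is a hypothetical name (nothing by that name
exists in R), it handles only file paths rather than general connections,
and it relies on file.rename replacing an existing target, which may not
hold on every platform or file system.

```r
# Hypothetical wrapper, not part of R: write the save file under a temporary
# name in the target directory and rename it only after save() has finished.
safe_save <- function(list, file, envir = parent.frame()) {
  # Temporary file in the same directory, so the rename stays on one file system.
  tmp <- tempfile(pattern = paste0(basename(file), ".tmp"),
                  tmpdir = dirname(file))
  # Runs on normal exit, error, or user interrupt. After a successful rename
  # the temporary file no longer exists, so this does nothing.
  on.exit(if (file.exists(tmp)) file.remove(tmp), add = TRUE)
  save(list = list, file = tmp, envir = envir)
  # Assumption: renaming over an existing file is permitted here.
  if (!file.rename(tmp, file))
    stop("could not rename '", tmp, "' to '", file, "'")
  invisible(file)
}

# Usage: an existing file under the target name is left untouched
# until the new one has been written completely.
x <- 1:3
target <- file.path(tempdir(), "example.RData")
safe_save("x", file = target)
```

An interrupt that arrives while save() is running leaves at most a stray
temporary file, and a kill -9 cannot be cleaned up from within R at all.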
Best,
luke
--
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa Phone: 319-335-3386
Department of Statistics and Fax: 319-335-3017
Actuarial Science
241 Schaeffer Hall email: luke at stat.uiowa.edu
Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu