[Rd] tempdir() may be deleted during long-running R session
Duncan Murdoch
murdoch.duncan at gmail.com
Wed Apr 26 19:47:16 CEST 2017
On 26/04/2017 10:39 AM, Tomas Kalibera wrote:
>
> I agree this should be solved in configuration of
> systemd/tmpreaper/whatever tmp cleaner - the cleanup must be prevented
> in configuration files of these tools. Moving session directories under
> /var/run (XDG_RUNTIME_DIR) does not seem like a good solution to me,
> sooner or later someone might come up with auto-cleaning of that directory too.
>
> It might still be useful if R could sometimes detect when automated
> cleanup happened and warn the user. Perhaps a simple way could be to
> always create an empty file inside session directory, like
> ".tmp_cleaner_trap". R would never touch this file, but check its
> existence from time to time. If it gets deleted, R would issue a warning and
> ask the user to check tmp cleaner configuration. The idea is that this
> file will be the oldest one in the session directory, so would get
> cleaned up first.
Yes, I like that idea, as long as checking for its existence doesn't
make some system think it is in use and therefore protected from deletion.
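A minimal sketch of the trap-file idea in R, assuming hypothetical helper names (trap_path, create_trap, check_trap) that are not part of R. As far as I know, file.exists() only stats the path without opening the file, so it should not update the access time, but that is an assumption worth verifying against the cleaner in use.

```r
# Hypothetical helpers; only the file name ".tmp_cleaner_trap" comes from
# the suggestion quoted above.
trap_path <- function(dir = tempdir()) file.path(dir, ".tmp_cleaner_trap")

# Create the empty trap file once, e.g. at session startup.
create_trap <- function(dir = tempdir()) {
  path <- trap_path(dir)
  if (!file.exists(path)) file.create(path)
  invisible(path)
}

# Check for the trap file and warn if it has disappeared.
check_trap <- function(dir = tempdir()) {
  ok <- file.exists(trap_path(dir))
  if (!ok)
    warning("temporary files appear to have been cleaned up; ",
            "check the configuration of your tmp cleaner")
  invisible(ok)
}
```

R would call create_trap() once and then check_trap() occasionally; how often, and from where, is left open here.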
Duncan Murdoch
>
> Tomas
>
>
> On 04/26/2017 02:29 PM, Duncan Murdoch wrote:
>> On 26/04/2017 4:21 AM, Martin Maechler wrote:
>>>>>>>> <frederik at ofb.net>
>>>>>>>> on Tue, 25 Apr 2017 21:13:59 -0700 writes:
>>>
>>> > On Tue, Apr 25, 2017 at 02:41:58PM +0000, Cook, Malcolm wrote:
>>> >> Might this combination serve the purpose:
>>> >> * R session keeps an open handle on the tempdir it creates,
>>> >> * whatever tempdir-harvesting cron job the user has is made
>>> >> sensitive enough not to delete open files (including open directories)
>>>
>>> I also agree that the above would be ideal - if possible.
>>>
>>> > Good suggestion but doesn't work with the (increasingly popular)
>>> > "Systemd":
>>>
>>> > $ mkdir /tmp/somedir
>>> > $ touch -d "12 days ago" /tmp/somedir/
>>> > $ cd /tmp/somedir/
>>> > $ sudo systemd-tmpfiles --clean
>>> > $ ls /tmp/somedir/
>>> > ls: cannot access '/tmp/somedir/': No such file or directory
>>>
>>> Something like your example is what I'd expect is always a
>>> possibility on some platforms, all of course depending on low-level
>>> things such as root/sysadmin/... "permission" to clean up etc.
>>>
>>> Jeroen mentioned the fact that tempdir()s also can disappear
>>> for other reasons {his was multicore child processes
>>> .. buggily(?) implemented}.
>>> Further reasons may be race conditions / user code bugs / user
>>> errors, etc.
>>> Note that the R process which created the tempdir on startup
>>> always has the permission to remove it again. But you can also
>>> think of a full file system, etc.
>>>
>>> Current R-devel's tempdir(check = TRUE) would create a new
>>> one or give an error (and then the user should be able to use
>>> Sys.setenv(TMPDIR = ...)
>>> to point to a directory she has write permission for)
>>>
>>> Gabe's point of course is important too: If you have a long
>>> running process that uses a tempfile,
>>> and if "big brother" has removed the full tempdir() you will
>>> be "unhappy" in any case.
>>> Trying to prevent big brother from doing that in all cases seems
>>> "not easy" in any case.
>>>
>>> I did want to provide an easy solution to the OP situation:
>>> Suddenly tempdir() is gone, and quite a few things stop working
>>> in the current R process {he mentioned help(), e.g.}.
>>> With the new tempdir(check=TRUE) facility, code could be changed
>>> to replace
>>>
>>> tempfile("foo")
>>>
>>> either by
>>> tempfile("foo", tmpdir=tempdir(check=TRUE))
>>>
>>> or by something like
>>>
>>> tryCatch(tempfile("foo"),
>>>          error = function(e)
>>>              tempfile("foo", tmpdir=tempdir(check=TRUE)))
>>>
>>> or be even more sophisticated.
>>>
>>> We could also consider allowing check = TRUE | NA | FALSE
>>>
>>> and make NA the default and have that correspond to
>>> check = TRUE but additionally do the equivalent of
>>> warning("tempdir() has become invalid and been recreated")
>>> in case the tempdir() had been invalid.
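The tri-state proposal above could be sketched as a wrapper around tempdir(). The name tempdir_checked is hypothetical, and the sketch assumes an R version in which tempdir(check = TRUE) is available (R-devel at the time of this thread).

```r
# Hypothetical wrapper, not part of base R.
#   check = FALSE: return tempdir() without any checking
#   check = TRUE : recreate the directory silently if it is no longer valid
#   check = NA   : as TRUE, but warn when the directory had gone missing
tempdir_checked <- function(check = NA) {
  if (identical(check, FALSE)) return(tempdir())
  was_missing <- !dir.exists(tempdir())
  dir <- tempdir(check = TRUE)
  if (was_missing && is.na(check))
    warning("tempdir() has become invalid and been recreated")
  dir
}
```

A call such as tempfile("foo", tmpdir = tempdir_checked()) would then warn once the session directory had been removed and recreated. The sketch only tests for existence, which is a simplification of whatever validity test tempdir(check = TRUE) itself applies.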
>>>
>>> > I would advocate just changing 'tempfile()' so that it recreates the
>>> > directory where the file is (the "dirname") before returning the file
>>> > path. This would have fixed the issue I ran into. Changing 'tempdir()'
>>> > to recreate the directory is another option.
>>>
>>> In the end I had decided that
>>>
>>> tempfile("foo", tmpdir = tempdir(check = TRUE))
>>>
>>> is actually better self-documenting than
>>>
>>> tempfile("foo", checkDir = TRUE)
>>>
>>> which was my first inclination.
>>>
>>> Note again that currently, the checking is _off_ by default.
>>> I've just provided a tool -- which was relatively easy and
>>> platform independent! --- to do more (real and thought)
>>> experiments.
>>
>> This seems like the wrong approach. The problem occurs as soon as the
>> tempdir() gets cleaned up: there could be information in temp files
>> that gets lost at that point. So the solution should be to prevent
>> the cleanup, not to continue on after it has occurred (as "check =
>> TRUE" does). This follows the principle that it's better for the
>> process to always die than to sometimes silently produce incorrect
>> results.
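The fail-fast principle described above could be sketched as follows. The function require_tempdir is a hypothetical name, not an existing R function, and it only illustrates stopping with an error instead of continuing after a cleanup.

```r
# Hypothetical guard: stop with an error if the session directory has
# vanished, rather than silently recreating it and carrying on.
require_tempdir <- function(dir = tempdir()) {
  if (!dir.exists(dir))
    stop("session temporary directory '", dir, "' no longer exists; ",
         "temporary files may have been lost")
  invisible(dir)
}
```

Code that depends on earlier temporary files could call require_tempdir() before reading them, so that a cleanup surfaces as an error and not as missing data.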
>>
>> Frederick posted the way to do this in systems using systemd. We
>> should be putting that in place, or the equivalent on systems using
>> other tempfile cleanups. This looks to me like something that "make
>> install" should do, or perhaps it should be done by people putting
>> together packages for specific systems.
>>
>> Duncan Murdoch
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>