[Rd] system/system2 and open file descriptors

William Dunlap wdunlap at tibco.com
Thu Apr 20 16:30:07 CEST 2017


In S+ on Unix-alikes we dealt with this issue by using fcntl(fd,
F_SETFD, 1) to set the close-on-exec flag on a file descriptor as soon
as we opened it.
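
A minimal sketch of that approach, assuming a POSIX system. The helper name open_cloexec is made up for illustration and does not come from S+ or R. It uses the symbolic constant FD_CLOEXEC rather than the literal 1, and reads the existing descriptor flags with F_GETFD first so they are preserved.

```c
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not part of S+ or R): open a file and mark the
 * descriptor close-on-exec immediately, so that a child created by
 * fork() followed by exec() does not inherit it. */
static int open_cloexec(const char *path, int flags, mode_t mode)
{
    int fd = open(path, flags, mode);
    if (fd < 0)
        return -1;

    /* Read the current descriptor flags, then add FD_CLOEXEC to them. */
    int fdflags = fcntl(fd, F_GETFD);
    if (fdflags < 0 || fcntl(fd, F_SETFD, fdflags | FD_CLOEXEC) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would use open_cloexec("foo.txt", O_WRONLY | O_CREAT, 0644) in place of open(). On systems that provide it, passing O_CLOEXEC to open() sets the flag in the same call, which avoids the short window between open() and fcntl() in multithreaded programs.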

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Wed, Apr 19, 2017 at 8:40 PM, Winston Chang <winstonchang1 at gmail.com> wrote:
> In addition to the issue of a child process holding onto open files, the
> child process can also manipulate a file descriptor in a way that affects
> the parent process. For example, calling lseek() in the child process will
> move the file offset in the parent process.
>
> Here is a set of commands that demonstrates it. They can be copied and
> pasted in a terminal. What it does:
> - Creates a C program that seeks file descriptor 3 back to the beginning of
> its file, and compiles it to an executable named "lseek".
> - Creates a file with some text in it.
> - Starts R. In R:
>     - Opens the text file and reads the first line.
>     - Runs lseek in a child process.
>     - Reads the rest of the lines.
>
>
> echo "#include <unistd.h>
> int main(void) {
>   lseek(3, 0, SEEK_SET);
> }" > lseek.c
>
> gcc lseek.c -o lseek
>
> echo "line 1
> line 2
> line 3" > lines.txt
>
> R
> f <- file('lines.txt', 'r')
> cat(readLines(f, n = 1), sep = "\n")
> system('./lseek')
> cat(readLines(f), sep = "\n")
>
>
> Here's what it outputs:
>> f <- file('lines.txt', 'r')
>> cat(readLines(f, n = 1), sep = "\n")
> line 1
>> system('./lseek')
>> cat(readLines(f), sep = "\n")
> line 2
> line 3
> line 1
> line 2
> line 3
>
> The child process has changed what the parent process reads from the file.
> (I'm guessing that readLines() prints out "line 2" and "line 3" before
> starting over because it has already buffered the whole file before lseek
> is executed.)
>
> This is obviously a highly contrived case, but it illustrates what's
> possible. The other issue I mentioned, with child processes holding open
> files after the R process exits, is more likely to cause problems in the
> real world. That's actually how I encountered this issue in the first
> place: when restarting R inside of RStudio on a Mac, if there are any
> extant child processes started by system(), they keep some files open, and
> this causes RStudio to hang. (There's a fix in progress for RStudio for
> this particular issue.)
>
> -Winston
>
>
>
> On Tue, Apr 18, 2017 at 3:20 PM, Winston Chang <winstonchang1 at gmail.com>
> wrote:
>
>> It seems that the system() and system2() functions don't close file
>> descriptors between the fork() and exec() (on Unix platforms, of course).
>> This means that the child processes inherit open files and socket
>> connections.
>>
>> Running this (from a terminal) will result in the child process writing to
>> a file that was opened by R:
>>
>> R
>> f <- file('foo.txt', 'w')
>> system('echo "abc" >&3')
>>
>>
>>
>> You can also see the open files if you run the following:
>>   f <- file('foo.txt', 'w')
>>   system2('sleep', '100', wait = FALSE)
>>
>> And then in another terminal:
>>   lsof -c R -c sleep
>> it will show that both the R and sleep processes have the file open:
>>   ...
>>   R       324 root    3w   REG   0,48        0   4259 /foo.txt
>>   ...
>>   sleep   327 root    3w   REG   0,48        0   4259 /foo.txt
>>
>>
>> This behavior can cause problems if R spawns a child process that outlives
>> the R process but keeps some resources open.
>>
>> Would it be possible to add an option to close file descriptors for child
>> processes? It would be nice if that were the default, but I suspect that
>> making that change would break a lot of existing code.
>>
>> To take an example from the Python world, subprocess.Popen() has an
>> option, close_fds, which closes all file descriptors except 0, 1, and 2.
>>   https://docs.python.org/2/library/subprocess.html#popen-constructor
>>
>>
>> -Winston
>>
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel


