[Rd] Detecting whether a process exists or not by its PID?

Tomas Kalibera <tomas.kalibera using gmail.com>
Fri Aug 31 15:35:29 CEST 2018


On 08/31/2018 03:13 PM, Gábor Csárdi wrote:
> On Fri, Aug 31, 2018 at 2:51 PM Tomas Kalibera <tomas.kalibera using gmail.com> wrote:
> [...]
>> kill(sig=0) is specified by POSIX but indeed as you say there is a race
>> condition due to PID-reuse.  In principle, detecting that a worker
>> process is still alive cannot be done correctly outside base R.
> I am not sure why you think so.
To avoid the race with PID reuse, one needs access to signal handling,
to blocking signals, and to handling SIGCHLD. system/system2 and
mcparallel/mccollect in base R use these features, and the interaction
is still safe given the specific use in system/system2 and
mcparallel/mccollect, yet it would have to be revisited if either of
the two uses changes. These features cannot be safely used outside of
base R, in contributed packages.

Tomas

>
>> At user-level I would probably consider some watchdog, e.g. the parallel
>> tasks would be repeatedly touching a file.
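A minimal base-R sketch of the file-touching watchdog idea above. The
helper names heartbeat_touch() and heartbeat_alive() and the timeout
value are assumptions made for illustration; nothing like them exists
in base R or in the parallel package. Note that a stale file only shows
that the worker has not reported recently, so a busy or hung worker
looks the same as a dead one.

```r
# Hypothetical helpers for a file-based watchdog (illustrative names).

# Worker side: call this periodically from within the task.
heartbeat_touch <- function(path) {
  # Rewriting the file advances its modification time.
  writeLines(format(Sys.time(), "%Y-%m-%d %H:%M:%OS3"), path)
  invisible(path)
}

# Monitor side: TRUE if the file was modified within 'timeout' seconds.
heartbeat_alive <- function(path, timeout = 10) {
  mtime <- file.mtime(path)            # NA if the file does not exist
  if (is.na(mtime)) return(FALSE)
  age <- as.numeric(difftime(Sys.time(), mtime, units = "secs"))
  age <= timeout
}
```

The timeout has to be comfortably larger than the interval at which
the worker touches its file, otherwise live workers get reported as
dead.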
> I am pretty sure that there are simpler and better solutions. E.g. one
> would be to
> ask the worker process for its startup time (with as much precision as possible)
> and then use the (pid, startup_time) pair as a unique id.
>
> With this you can check if the process is still running, by checking
> that the pid exists,
> and that its startup time matches.
>
> This is all very simple with the ps package, on Linux, macOS and Windows.
>
> Gabor
>
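The (pid, startup_time) idea above could look roughly like the sketch
below. It assumes the CRAN package ps is installed and that the worker
runs on the same machine as the process doing the check. proc_id() and
proc_alive() are hypothetical names chosen for this example, not
functions of ps or of base R.

```r
# Assumes the CRAN package 'ps'; helper names are illustrative only.

# Worker side: report the PID together with the process start time.
proc_id <- function() {
  h <- ps::ps_handle()                 # handle for the current process
  list(pid = ps::ps_pid(h), created = ps::ps_create_time(h))
}

# Checking side: TRUE only if a process with this PID is running
# and its start time matches the one recorded earlier.
proc_alive <- function(id) {
  h <- ps::ps_handle(id$pid, id$created)
  ps::ps_is_running(h)
}

# Possible use with a local cluster 'cl' (not run here):
# ids <- parallel::clusterCall(cl, proc_id)
# vapply(ids, proc_alive, logical(1))
```

Because the start time is part of the identity, a new process that
happens to reuse the PID is expected not to be mistaken for the
original worker.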
>> In base R, one can do this correctly for forked processes via
>> mcparallel/mccollect, not for PSOCK cluster workers which are based on
>> system() (and I understand it would be a useful feature)
>>
>>   > j <- mcparallel(Sys.sleep(1000))
>>   > mccollect(j, wait=FALSE)
>> NULL
>>
>> # kill the child process
>>
>>   > mccollect(j, wait=FALSE)
>> $`1542`
>> NULL
>>
>> More details indeed in ?mcparallel. The key part is that the job must be
>> started as non-detached and as soon as mccollect() collects it,
>> mccollect() must never be called on it again.
>>
>> Tomas
>>
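One way to respect the "never call mccollect() on it again" rule from
the message above is to wrap the job in a small closure that remembers
whether it has already been collected. This is only a sketch for
Unix-alikes, where mcparallel() is available; start_job() and the
fields of the returned list are hypothetical names.

```r
library(parallel)

# Hypothetical wrapper: starts a forked job and makes sure that
# mccollect() is not called on it again once it has been collected.
start_job <- function(fun) {
  job <- mcparallel(fun())             # non-detached job (the default)
  collected <- FALSE
  result <- NULL
  is_running <- function() {
    if (!collected) {
      res <- mccollect(job, wait = FALSE)
      # NULL means nothing to collect yet; a list means the child
      # has finished or died.
      if (!is.null(res)) {
        collected <<- TRUE
        result <<- res
      }
    }
    !collected
  }
  list(is_running = is_running, result = function() result)
}
```

For example, j <- start_job(function() Sys.sleep(1000)) followed by
j$is_running() can be polled any number of times; after the child is
gone the stored result is returned by j$result() without touching
mccollect() again.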
>>> I can get the PID of each cluster node by querying them for their
>>> Sys.getpid(), e.g.
>>>
>>>       pids <- parallel::clusterEvalQ(cl, Sys.getpid())
>>>
>>> Is there a function in core R for testing whether a process with a
>>> given PID exists or not? From trial'n'error, I found that on Linux:
>>>
>>>     pid_exists <- function(pid) as.logical(tools::pskill(pid, signal = 0L))
>>>
>>> returns TRUE for existing processes and FALSE otherwise, but I'm not
>>> sure if I can trust this.  It's not a documented feature in
>>> ?tools::pskill, which also warns about 'signal' not being standardized
>>> across OSes.
>>>
>>> The other Linux alternative I can imagine is:
>>>
>>>     pid_exists <- function(pid) {
>>>       system2("ps", args = c("--pid", pid), stdout = FALSE) == 0L
>>>     }
>>>
>>> Can I expect this to work on macOS as well?  What about other *nix systems?
>>>
>>> And, finally, what can be done on Windows?
>>>
>>> I'm sure there are packages on CRAN that provide this, but I'd like
>>> to keep dependencies at a minimum.
>>>
>>> I appreciate any feedback. Thxs,
>>>
>>> Henrik
>>>
>>> ______________________________________________
>>> R-devel using r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel


