[Rd] [External] Re: .Internal(quit(...)): system call failed: Cannot allocate memory

luke-tierney using uiowa.edu
Wed Nov 25 21:38:54 CET 2020


On Tue, 24 Nov 2020, Jan Gorecki wrote:

> As for other calls to system: I avoid calling system. In the past I
> had some (to get memory stats from the OS), but they were failing with
> exactly the same issue. So yes, if I added a call to system before
> calling quit, I believe it would fail with the same error.
> At the same time I think (although I am not sure) that new allocations
> made in R are working fine. So R seems to reserve some memory and can
> continue to operate, while an external call like system will fail. Maybe
> it is like this by design, I don't know.

Thanks for the report on quit(). We're exploring how to make the
cleanup on exit more robust to low memory situations like these.
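
The check discussed in this thread (does any call to system() fail at this point, or only the one made during quit()?) could be sketched as below. This is a minimal sketch, not anything R provides: the helper name probe_system and the use of the Unix no-op command "true" are assumptions for illustration.

```r
# Hypothetical helper: try to launch a trivial external command just
# before quitting and record whether the launch itself succeeded.
probe_system <- function(cmd = "true") {
  warn <- NULL
  status <- withCallingHandlers(
    system(cmd),
    warning = function(w) {
      # Under memory pressure this is where a message such as
      # "system call failed: Cannot allocate memory" would be captured.
      warn <<- conditionMessage(w)
      invokeRestart("muffleWarning")
    }
  )
  list(status = status,
       warning = warn,
       ok = identical(status, 0L) && is.null(warn))
}

res <- probe_system()
```

Writing res to a log file rather than to stderr keeps the probe from interfering with failure detection that is based on stderr output.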

>
> Aside from this problem, which is easy to report because of the warning
> message, I think that gc() is choking at the same time. I tried to
> make a reproducible example for that multiple times but couldn't; let
> me try one more time.
> It tends to manifest when there are 4e8+ unique characters/factors in
> an R session. I am able to reproduce it using data.table and dplyr
> (0.8.4 because 1.0.0+ fails even sooner), but using base R is not easy
> because of the size. I briefly described the problem in:
> https://github.com/h2oai/db-benchmark/issues/110

Because of the design of R's character vectors, with each element
allocated separately, R is never going to be great at handling huge
numbers of distinct strings. But it can do an adequate job given
enough memory to work with.
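
A small illustration of the cost of distinct strings, as a sketch only: the vector length is arbitrary, and object.size() gives an estimate rather than an exact measurement. It compares a character vector of distinct strings with one of the same length that refers to a single string.

```r
n <- 1e5
distinct <- as.character(seq_len(n))  # n different strings, each its own object
repeated <- rep("a", n)               # one string, referenced n times

size_distinct <- object.size(distinct)
size_repeated <- object.size(repeated)

print(size_distinct, units = "MB")
print(size_repeated, units = "MB")
```

Both vectors hold the same number of elements; the difference in the estimates comes from the separately allocated strings, which is the part that grows with the number of unique values.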

When I run your GitHub issue example on a machine with around 500 GB
of RAM it seems to run OK; /usr/bin/time reports

2706.89user 161.89system 37:10.65elapsed 128%CPU (0avgtext+0avgdata 92180796maxresident)k
0inputs+103450552outputs (0major+38716351minor)pagefaults 0swaps

So the memory footprint is quite large: the maximum resident set size
is about 92 million kB, or roughly 88 GiB. Using gc.time() it looks like
about 1/3 of the time is in GC. Not ideal, and maybe it could be improved
on a bit, but probably not by much. The GC is basically doing an
adequate job, given enough RAM.
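
For reference, the share of time spent in GC can be estimated along these lines. This is a sketch with a small stand-in workload, not the benchmark from the GitHub issue; the variable names and the vector size are assumptions.

```r
invisible(gc.time())      # make sure GC timing is switched on
gc_before <- gc.time()

run_time <- system.time({
  # Stand-in workload: create many distinct strings and a factor from them.
  x <- as.character(seq_len(5e5))
  f <- factor(x)
})

gc_after <- gc.time()

# Third element is elapsed time (seconds) spent in garbage collection.
gc_elapsed <- (gc_after - gc_before)[3]
gc_share <- gc_elapsed / run_time[["elapsed"]]
```

Here gc_share is the fraction of elapsed time attributed to GC for the timed block. Calling gcinfo(TRUE) beforehand additionally prints memory statistics at each collection.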

If you run this example on a system without enough RAM, or with other
programs competing for RAM, you are likely to end up fighting with
your OS/hardware's virtual memory system. When I try to run it on a
16 GB system it churns for an hour or so before getting killed, and
/usr/bin/time reports a huge number of page faults:

312523816inputs+0outputs (24761285major+25762068minor)pagefaults 0swaps

You are probably experiencing something similar.

There may be opportunities for more tuning of the GC to better handle
running this close to memory limits, but I doubt the payoff would be
worth the effort.

Best,

luke

> It would help if gcinfo() could take FALSE/TRUE/2L, where 2L would print
> even more information about gc, like how much time each gc()
> process took and how many objects it had to check on each level.
>
> Best regards,
> Jan
>
>
>
> On Tue, Nov 24, 2020 at 1:05 PM Tomas Kalibera <tomas.kalibera using gmail.com> wrote:
>>
>> On 11/24/20 11:27 AM, Jan Gorecki wrote:
>>> Thanks Bill for checking that.
>>> It was my impression that warnings are raised from some internal
>>> system calls made when quitting R. At that point I don't have much
>>> control over checking the return status of those.
>>> Your suggestion looks good to me.
>>>
>>> Tomas, do you think this could help? could this be implemented?
>>
>> I think this is a good suggestion. Deleting files on Unix was changed
>> from system("rm") to doing that in C, and deleting the session directory
>> should follow.
>>
>> It might also help diagnosing your problem, but I don't think it would
>> solve it. If the diagnostics in R works fine and the OS was so
>> hopelessly out of memory that it couldn't run any more external
>> processes, then really this is not a problem of R, but of having
>> exhausted the resources. And it would be a coincidence that just this
>> particular call to "system" at the end of the session did not work.
>> Anything else could break as well close to the end of the script. This
>> seems the most likely explanation to me.
>>
>> Do you get this warning repeatedly, reproducibly at least in slightly
>> different scripts at the very end, with this warning always from quit()?
>> So that the "call" part of the warning message has .Internal(quit) like
>> in the case you posted? Would adding another call to "system" before the
>> call to "q()" work - with checking the return value? If it is always
>> only the last call to "system" in "q()", then it is suspicious, perhaps
>> an indication that some diagnostics in R is not correct. In that case, a
>> reproducible example would be the key - so either if you could diagnose
>> on your end what is the problem, or create a reproducible example that
>> someone else can use to reproduce and debug.
>>
>> Best
>> Tomas
>>
>>>
>>> On Mon, Nov 23, 2020 at 7:10 PM Bill Dunlap <williamwdunlap using gmail.com> wrote:
>>>> The call to system() probably is an internal call used to delete the session's tempdir().  This sort of failure means that a potentially large amount of disk space is not being recovered when R is done.  Perhaps R_CleanTempDir() could call R_unlink() instead of having a subprocess call 'rm -rf ...'.  Then it could also issue a specific warning if it was impossible to delete all of tempdir().  (That should be very rare.)
>>>>
>>>>> q("no")
>>>> Breakpoint 1, R_system (command=command@entry=0x7fffffffa1e0 "rm -Rf /tmp/RtmppoKPXb") at sysutils.c:311
>>>> 311     {
>>>> (gdb) where
>>>> #0  R_system (command=command@entry=0x7fffffffa1e0 "rm -Rf /tmp/RtmppoKPXb") at sysutils.c:311
>>>> #1  0x00005555557c30ec in R_CleanTempDir () at sys-std.c:1178
>>>> #2  0x00005555557c31d7 in Rstd_CleanUp (saveact=<optimized out>, status=0, runLast=<optimized out>) at sys-std.c:1243
>>>> #3  0x00005555557c593d in R_CleanUp (saveact=saveact@entry=SA_NOSAVE, status=status@entry=0, runLast=<optimized out>) at system.c:87
>>>> #4  0x00005555556cc85e in do_quit (call=<optimized out>, op=<optimized out>, args=0x555557813f90, rho=<optimized out>) at main.c:1393
>>>>
>>>> -Bill
>>>>
>>>> On Mon, Nov 23, 2020 at 3:15 AM Tomas Kalibera <tomas.kalibera using gmail.com> wrote:
>>>>> On 11/21/20 6:51 PM, Jan Gorecki wrote:
>>>>>> Dear R-developers,
>>>>>>
>>>>>> Some of the more fat scripts (50+ GB mem used by R) that I am running,
>>>>>> when they finish they do quit with q("no", status=0)
>>>>>> Quite often it happens that there is an extra stderr output produced
>>>>>> at the very end which looks like this:
>>>>>>
>>>>>> Warning message:
>>>>>> In .Internal(quit(save, status, runLast)) :
>>>>>>     system call failed: Cannot allocate memory
>>>>>>
>>>>>> Is there any way to avoid this kind of warnings? I am using stderr
>>>>>> output for detecting failures in scripts and this warning is a false
>>>>>> positive of a failure.
>>>>>>
>>>>>> Maybe quit function could wait little bit longer trying to allocate
>>>>>> before it raises this warning?
>>>>> If you see this warning, some call to system() or system2() or similar,
>>>>> which executes an external program, failed to even run a shell to run
>>>>> that external program, because there was not enough memory. You should
>>>>> be able to find out where it happens by checking the exit status of
>>>>> system().
>>>>>
>>>>> Tomas
>>>>>
>>>>>
>>>>>> Best regards,
>>>>>> Jan Gorecki
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-devel using r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>
>
>

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:   luke-tierney using uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu
