[Rd] [R-pkg-devel] Run garbage collector when too many open files
luke-tierney using uiowa.edu
Tue Aug 7 17:07:28 CEST 2018
In R 3.5 and later you should not need to gc() -- that should happen
automatically within the connections code.
Nevertheless, I would recommend redesigning your approach to avoid
hanging onto open file connections as these are a scarce resource.
You can keep around your temporary files without having them open and
only open/close them on access, with the close run in an on.exit or a
tryCatch/finally clause.
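
A minimal sketch of that pattern, in case it helps. The helper names
(write_chunk, read_chunk, read_chunk2) are hypothetical and not part of
lvec or ldat; the sketch assumes the data can be stored as plain binary
integers. Only the file path is kept between calls, and a connection is
held open just for the duration of one access:

```r
# Hypothetical helpers: keep the path around, not an open connection.
write_chunk <- function(path, values) {
  con <- file(path, open = "wb")
  on.exit(close(con), add = TRUE)   # runs even if writeBin() fails
  writeBin(as.integer(values), con)
  invisible(path)
}

read_chunk <- function(path, n) {
  con <- file(path, open = "rb")
  on.exit(close(con), add = TRUE)   # runs even if readBin() fails
  readBin(con, what = "integer", n = n)
}

# The same idea with tryCatch/finally instead of on.exit.
read_chunk2 <- function(path, n) {
  con <- file(path, open = "rb")
  tryCatch(
    readBin(con, what = "integer", n = n),
    finally = close(con)
  )
}

path <- tempfile("lvec")            # the temporary file persists between calls
write_chunk(path, c(0L, 0L, 0L))
x <- read_chunk(path, 3)
y <- read_chunk2(path, 3)
unlink(path)                        # remove the file once it is no longer needed
```

With this layout the number of open handles no longer depends on when the
garbage collector happens to run, at the cost of an open/close per access.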
Best,
luke
On Tue, 7 Aug 2018, Jan van der Laan wrote:
> Dear Uwe,
>
> (When replying to your message, I sent the reply to r-devel and not
> r-package-devel, as Martin Maechler suggested that this thread would be a
> better fit for r-devel.)
>
> Thanks. In the example below I used rm() explicitly, but in general users
> wouldn't do that.
>
> One of the reasons for the large number of file handles is that sometimes
> unnamed temporary objects are created. For example:
>
>> library(ldat)
>> library(lvec)
>>
>> a <- lvec(10, "integer")
> OPENFILE '/tmp/RtmpVqkDsw/file32145169fb06/lvec3214753f2af0'
>> b <- as_rvec(a[1:3])
> OPENFILE '/tmp/RtmpVqkDsw/file32145169fb06/lvec32146a50f383'
> OPENFILE '/tmp/RtmpVqkDsw/file32145169fb06/lvec3214484b652c'
>> print(b)
> [1] 0 0 0
>>
>>
>> gc()
> CLOSEFILE '/tmp/RtmpVqkDsw/file32145169fb06/lvec3214484b652c'
> CLOSEFILE '/tmp/RtmpVqkDsw/file32145169fb06/lvec32146a50f383'
>           used (Mb) gc trigger (Mb) max used (Mb)
> Ncells  796936 42.6    1442291 77.1  1168576 62.5
> Vcells 1519523 11.6    4356532 33.3  4740854 36.2
>
>
> For debugging, I log when files are opened and closed. The call a[1:3] (which
> creates a slice of a) creates two temporary objects [1]. These are only
> deleted when I explicitly call gc() or at some other arbitrary moment.
>
> I hope this illustrates the problem better.
>
>
> Best,
> Jan
>
>
> [1] One improvement would be to create fewer temporary files; often these
> contain only very little information that is better kept in memory. But that
> is only a partial solution.
>
>
>
>
> On 07-08-18 15:24, Uwe Ligges wrote:
>> Why not add functionality that deletes an object and runs its cleanup code?
>>
>> Best,
>> Uwe Ligges
>>
>>
>>
>> On 07.08.2018 14:26, Jan van der Laan wrote:
>>>
>>>
>>> In my package I open temporary files from C++; handles to them
>>> are returned to R through vptr objects. The files are deleted when the
>>> corresponding R object is deleted and the garbage collector runs:
>>>
>>> a <- lvec(10, "integer")
>>> rm(a)
>>>
>>> Then when the garbage collector runs the file is deleted. However, on some
>>> platforms (probably with lower limits on the maximum number of file
>>> handles a process can have open), I run into the problem that the garbage
>>> collector doesn't run often enough. In this case that means that another
>>> package of mine using this package generates an error when its tests are
>>> run.
>>>
>>> The simplest solution is to add some calls to gc() in my tests. But a more
>>> general/automatic solution would be nice.
>>>
>>> I thought about something along the lines of
>>>
>>> robust_lvec <- function(...) {
>>>   tryCatch({
>>>     lvec(...)
>>>   }, error = function(e) {
>>>     gc()
>>>     lvec(...) # duplicated code
>>>   })
>>> }
>>>
>>> i.e., try to open a file; when that fails, call the garbage collector and
>>> try again. However, this introduces duplicated code (in this case only one
>>> line, but that can be more), and doesn't help if it is another function
>>> that tries to open a file.
>>>
>>> Is there a better solution?
>>>
>>> Thanks!
>>>
>>> Jan
>>>
>>> ______________________________________________
>>> R-package-devel using r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-package-devel
>
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa Phone: 319-335-3386
Department of Statistics and Fax: 319-335-3017
Actuarial Science
241 Schaeffer Hall email: luke-tierney using uiowa.edu
Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu