[Rd] Is it possible to increase MAX_NUM_DLLS in future R releases?
Martin Maechler
maechler at stat.math.ethz.ch
Tue May 10 08:57:25 CEST 2016
>>>>> Qin Zhu <qinzhu at outlook.com>
>>>>> on Fri, 6 May 2016 11:33:37 -0400 writes:
> Thanks for all your great answers.
> The app I’m working on is indeed an exploratory data analysis tool for gene expression, which requires a bunch of Bioconductor packages.
> I guess for now, my best solution is to divide my app into modules and load/unload packages as the user switches from one module to another.
> This brought up another question: it seems that unloading a package with detach()/unloadNamespace() does not unload its DLLs, or, in the case of the "scde" package, not all of the dependent DLLs:
>> length(getLoadedDLLs())
> [1] 9
>> requireNamespace("scde")
> Loading required namespace: scde
>> length(getLoadedDLLs())
> [1] 34
>> unloadNamespace("scde")
> now dyn.unload("/Library/Frameworks/R.framework/Versions/3.3/Resources/library/scde/libs/scde.so") ...
>> length(getLoadedDLLs())
> [1] 33
> Does that mean I should use dyn.unload() to unload whatever I think is associated with that package when the user’s done using it? I’m a little nervous about this because it seems to be OS-dependent, and previous versions of my app are running on both Windows and Macs.
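For what it's worth, a more portable route than calling dyn.unload() with a hand-built path (which hard-codes the OS-specific file extension and library location) is library.dynam.unload(), which resolves the DLL from the package name. A minimal sketch, using the scde package from the session above:

```r
## Unload the namespace, then drop the package's own DLL without
## hard-coding any .so/.dll path: library.dynam.unload() resolves it
## from the package name and installation directory on every platform.
unloadNamespace("scde")
library.dynam.unload("scde", system.file(package = "scde"))

## The package's DLL should no longer appear in the session table:
"scde" %in% names(getLoadedDLLs())
```

Note that this only removes scde's own DLL; DLLs belonging to dependencies whose namespaces remain loaded stay in the table.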
Hmm, I would have thought that dyn.unload() typically works on all
platforms, but I have not researched the question just now, and am happy
to learn more by being corrected.
Even if we increase MAX_NUM_DLLS in a future release, a considerable
portion of your app's users will not be using that future version of R yet,
and so you should try to "fight" the problem now.
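One way to "fight" it along the lines Qin describes: keep a per-module package list, load namespaces on entry to a module, and unload them on exit. A hypothetical sketch (the module names and package lists are illustrative placeholders):

```r
## Hypothetical per-module package management for a Shiny app.
## Module names and package assignments are placeholders.
module_packages <- list(
  expression = c("scde"),
  annotation = c("GenomicFeatures")
)

enter_module <- function(name) {
  for (pkg in module_packages[[name]])
    requireNamespace(pkg, quietly = TRUE)
}

leave_module <- function(name) {
  ## unloadNamespace() fails if another loaded namespace still
  ## imports the package, hence the try().
  for (pkg in rev(module_packages[[name]]))
    try(unloadNamespace(pkg), silent = TRUE)
}
```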
> Any suggestions would be appreciated, and I’d appreciate if the MAX_NUM_DLLS can be increased.
> Thanks,
> Qin
>> On May 4, 2016, at 9:17 AM, Martin Morgan <martin.morgan at roswellpark.org> wrote:
>>
>>
>>
>> On 05/04/2016 05:15 AM, Prof Brian Ripley wrote:
>>> On 04/05/2016 08:44, Martin Maechler wrote:
>>>>>>>>> Qin Zhu <qinzhu at outlook.com>
>>>>>>>>> on Mon, 2 May 2016 16:19:44 -0400 writes:
>>>>
>>>> > Hi,
>>>> > I’m working on a Shiny app for statistical analysis. I ran into
>>>> this "maximal number of DLLs reached" issue recently because my app
>>>> requires importing many other packages.
>>>>
>>>> > I’ve posted my question on stackoverflow
>>>> (http://stackoverflow.com/questions/36974206/r-maximal-number-of-dlls-reached).
>>>>
>>>>
>>>> > I’m just wondering is there any reason to set the maximal
>>>> number of DLLs to be 100, and is there any plan to increase it/not
>>>> hardcoding it in the future? It seems many people are also running
>>>> into this problem. I know I can work around this problem by modifying
>>>> the source, but since my package is going to be used by other people,
>>>> I don’t think this is a feasible solution.
>>>>
>>>> > Any suggestions would be appreciated. Thanks!
>>>> > Qin
>>>>
>>>> Increasing that number is of course "possible"... but it also
>>>> costs a bit (adding to the fixed memory footprint of R).
>>>
>>> And not only that. At the time this was done (and it was once 50) the
>>> main cost was searching DLLs for symbols. That is still an issue, and
>>> few packages exclude their DLL from symbol search, so if symbols have
>>> to be searched for, a lot of DLLs will be searched. (Registering all
>>> the symbols needed in a package avoids a search, and nowadays, by
>>> default, searches from a namespace are restricted to that namespace.)
>>>
>>> See
>>> https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Registering-native-routines
>>> for some further details about the search mechanism.
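Session-side, one can at least see which of the loaded DLLs still permit dynamic symbol lookup, i.e. the ones that make cross-DLL symbol searches expensive. A sketch relying on the documented "dynamicLookup" field of DLLInfo objects:

```r
## Each element of getLoadedDLLs() is a DLLInfo object; its
## "dynamicLookup" field records whether symbol lookup in that DLL
## is open (TRUE) or restricted to registered routines (FALSE).
dlls <- getLoadedDLLs()
vapply(dlls, function(d) d[["dynamicLookup"]], logical(1))
```

Packages turn dynamic lookup off with useDynLib(pkg, .registration = TRUE) in their NAMESPACE plus R_useDynamicSymbols(dll, FALSE) in their C initialization routine, as described in the manual section linked above.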
>>>
>>>> I did not set that limit, but I'm pretty sure it was also meant
>>>> as a reminder for the useR to "clean up" a bit in her / his R
>>>> session, i.e., not load package namespaces unnecessarily. I
>>>> cannot yet imagine that you need > 100 packages | namespaces
>>>> loaded in your R session. OTOH, some packages nowadays have a
>>>> host of dependencies, so I agree that this at least may happen
>>>> accidentally more frequently than in the past.
>>>
>>> I am not convinced that it is needed. The OP says he imports many
>>> packages, and I doubt that more than a few are required at any one time.
>>> Good practice is to load namespaces as required, using requireNamespace.
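That practice looks roughly like the following: attach nothing up front, guard each feature's code path with requireNamespace(), and call functions via `::`. (The pheatmap dependency here is purely illustrative.)

```r
## Load-on-demand: the namespace is loaded only when this code path
## actually runs, and nothing is attached to the search path.
plot_heatmap <- function(mat) {
  if (!requireNamespace("pheatmap", quietly = TRUE))  # illustrative dep
    stop("the 'pheatmap' package is needed for this feature")
  pheatmap::pheatmap(mat)
}
```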
>>
>> Extensive package dependencies in Bioconductor make it pretty easy to end up with dozens of packages attached or loaded. For instance
>>
>> library(GenomicFeatures)
>> library(DESeq2)
>>
>> > length(loadedNamespaces())
>> [1] 63
>> > length(getLoadedDLLs())
>> [1] 41
>>
>> Qin's use case is a shiny app, presumably trying to provide relatively comprehensive access to a particular domain. Even if the app were to load / requireNamespace() (this requires considerable programming discipline to ensure that the namespace is available on all programming paths where it is used), it doesn't seem at all improbable that the user in an exploratory analysis would end up accessing dozens of packages with orthogonal dependencies. This is also the use case with Karl Forner's post https://stat.ethz.ch/pipermail/r-devel/2015-May/071104.html (adding library(crlmm) to the above gets us to 53 DLLs).
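A quick way to attribute DLL count to a single load, in the spirit of the counts above (the package name is taken from the example):

```r
## Diff the loaded-DLL table before and after loading one namespace
## to see how many DLLs that load alone pulls in.
before <- names(getLoadedDLLs())
requireNamespace("GenomicFeatures", quietly = TRUE)
after  <- names(getLoadedDLLs())
setdiff(after, before)           # DLLs attributable to this load
length(setdiff(after, before))
```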
>>
>>>
>>>> The real solution of course would be a code improvement that
>>>> starts with a relatively small number of "DLLinfo" structures
>>>> (say 32), and then allocates more batches (of size say 32) if
>>>> needed.
>>>
>>> The problem of course is that such code will rarely be exercised, and
>>> people have made errors on the boundaries (here multiples of 32) many
>>> times in the past. (Note too that DLLs can be removed as well as added,
>>> another point of coding errors.)
>>
>> That argues for a simple increase in the maximum number of DLLs. This would enable some people to have very bulky applications that pay a performance cost (but the cost here is in small fractions of a second...) in terms of symbol look-up (and collision?), but would have no consequence for those of us with more sane use cases.
I'm seconding Martin Morgan's argument. We could go up to 200.
Computer memory has increased a lot since we set the
limit to 100, and symbol search performance indeed would
only be affected in those use cases with (too) many DLLs.
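In the meantime, an app can at least monitor its headroom against the hard-coded limit (the constant is MAX_NUM_DLLS in src/main/Rdynload.c, currently 100):

```r
## Fraction of the DLL table currently in use, against the
## hard-coded limit (MAX_NUM_DLLS, 100 at the time of writing).
max_dlls <- 100
length(getLoadedDLLs()) / max_dlls
```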
Martin Maechler
>> Martin Morgan
>>