[Rd] Request: Increasing MAX_NUM_DLLS in Rdynload.c

Steve Bronder sbronder at stevebronder.com
Tue Dec 20 21:08:09 CET 2016


See inline.


On Tue, Dec 20, 2016 at 12:14 PM, Spencer Graves <
spencer.graves at prodsyse.com> wrote:

> Hi, Dirk:
>
>
>
> On 12/20/2016 10:56 AM, Dirk Eddelbuettel wrote:
>
>> On 20 December 2016 at 17:40, Martin Maechler wrote:
>> | >>>>> Steve Bronder <sbronder at stevebronder.com>
>> | >>>>>     on Tue, 20 Dec 2016 01:34:31 -0500 writes:
>> |
>> |     > Thanks Henrik this is very helpful! I will try this out on our tests and
>> |     > see if gcDLLs() has a positive effect.
>> |
>> |     > mlr currently has tests broken down by learner type such as classification,
>> |     > regression, forecasting, clustering, etc. There are 83 classifiers alone
>> |     > so even when loading and unloading across learner types we can still hit
>> |     > the MAX_NUM_DLLS error, meaning we'll have to break them down further (or
>> |     > maybe we can be clever with gcDLLs()?). I'm CC'ing Lars Kotthoff and Bernd
>> |     > Bischl to make sure I am representing the issue well.
>> |
>> | This came up *here* in May 2015
>> | and then May 2016 ... did you not find it when googling?
>> |
>> | Hint:  Use
>> |        site:stat.ethz.ch MAX_NUM_DLLS
>> | as search string in Google, so it will basically only search the
>> | R mailing list archives
>>
> I did not know this, and I apologize. I starred this email so I can use it
next time I have a question or request. I did find (and left a comment on)
the Stack Overflow question where you posted an answer on this topic:
http://stackoverflow.com/a/37021455/2269255

>> |
>> | Here's the start of that thread:
>> |
>> |   https://stat.ethz.ch/pipermail/r-devel/2016-May/072637.html
>> |
>> | There was not a clear conclusion back then, notably as
>> | Prof Brian Ripley noted that 100 had already been an increase
>> | and that a large number of loaded DLLs decreases look up speed.
>> |
>> | OTOH (I think others have noted that) a large number of DLLs
>> | only penalizes those who *do* load many, and we should probably
>> | increase it.
>>
> Am I correct in understanding that the decrease in lookup speed only
happens when a large number of DLLs are actually loaded? If so, this is an
expected cost of having many DLLs, and one that I, and I would guess other
developers, would be willing to pay to have more DLLs available. If
increasing MAX_NUM_DLLS would increase R's fixed memory footprint by a
significant amount, then I think that's a reasonable argument against the
increase.


>> |
>> | Your use case of "hyper packages" which load many others
>> | simultaneously is somewhat convincing to me... in so far as the
>> | general feeling is that memory should be cheap and limits should
>> | not be low.
>>
> It should also be pointed out that even in the case of "hyper packages"
like mlr, this is only an issue during unit testing. I wonder if there is
some middle ground here? Would it be difficult to have a compile flag that
would change the value of MAX_NUM_DLLS when compiling R from source? I
believe this would allow us to increase MAX_NUM_DLLS when testing on Travis
and Jenkins while keeping the same footprint for regular users.


>> |
>> | (In spite of Brian Ripleys good reasons against it, I'd still
>> |  aim for a *dynamic*, i.e. automatically increased list here).
>>
>> Yes.  Start with 10 or 20, add 10 as needed.  Still fast in the 'small N'
>> case and no longer a road block for the 'big N' case required by mlr et
>> al.
>>
> This would be nice! My concern, though, is the R-core team's time. This is
the best answer, but I don't feel comfortable requesting it because I can't
help with it and do not want to take up R-core's time without a very
significant reason.
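
To make the "start with 10 or 20, add 10 as needed" idea concrete, here is a
rough C sketch of a list that grows in fixed steps. It is not R's actual
code: DllList, dll_list_init, dll_list_add, dll_list_free, and the two
constants are names I made up for illustration, and the entries are plain
name strings rather than whatever Rdynload.c really stores. (For those not
familiar with C++, std::vector is its standard growable array, so this is a
hand-rolled C analogue of the same idea.)

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sizes, taken from "start with 10 or 20, add 10 as needed". */
enum { DLL_INITIAL_CAPACITY = 20, DLL_GROWTH_STEP = 10 };

/* Hypothetical growable list of DLL names (the strings are not owned). */
typedef struct {
    const char **names;  /* heap-allocated array, NULL while empty */
    size_t count;        /* number of entries in use */
    size_t capacity;     /* number of entries allocated */
} DllList;

/* Start empty; nothing is allocated until the first add. */
void dll_list_init(DllList *list)
{
    list->names = NULL;
    list->count = 0;
    list->capacity = 0;
}

/* Append a name, growing the array by a fixed step when it is full.
 * Returns 0 on success, -1 if memory could not be allocated. */
int dll_list_add(DllList *list, const char *name)
{
    assert(list != NULL && name != NULL);
    if (list->count == list->capacity) {
        size_t new_cap = (list->capacity == 0)
                             ? DLL_INITIAL_CAPACITY
                             : list->capacity + DLL_GROWTH_STEP;
        const char **grown = realloc(list->names, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;   /* the old array is still valid and unchanged */
        list->names = grown;
        list->capacity = new_cap;
    }
    list->names[list->count++] = name;
    return 0;
}

/* Release the array and return the list to its empty state. */
void dll_list_free(DllList *list)
{
    free(list->names);
    dll_list_init(list);
}
```

With this layout a session that loads only a few DLLs never allocates more
than the first 20 slots, and the only hard limit left is available memory.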

Unit testing for a meta-package is a particular case, though I think an
important one which will impact R over the long term. The answers from
least to most complex are something like:
1. Do nothing
2. Increase MAX_NUM_DLLS
3. Compiler flag for MAX_NUM_DLLS (I actually have no reference for how
difficult this would be)
4. Change to a dynamically growing list of DLLs
I'm requesting (2) because I think it's a simple short-term answer until
someone has time to sit down and work out (4).
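
For (3), a minimal sketch of what I have in mind, assuming the limit is only
ever used through the macro (I have not checked how Rdynload.c uses it
elsewhere). The default stays at 100 unless the person building from source
passes something like -DMAX_NUM_DLLS=500 to the C compiler. DllEntry,
dll_table, dll_capacity, and dll_register are hypothetical stand-ins, not
R's real data structures.

```c
#include <assert.h>
#include <stdio.h>

/* Use the built-in default only when the build did not supply a value,
 * e.g. via  cc -DMAX_NUM_DLLS=500 ...  */
#ifndef MAX_NUM_DLLS
#define MAX_NUM_DLLS 100
#endif

/* Hypothetical stand-in for one entry in the table of loaded DLLs. */
typedef struct {
    const char *name;
} DllEntry;

/* Fixed-size table whose size is decided at compile time. */
static DllEntry dll_table[MAX_NUM_DLLS];
static int dll_count = 0;

/* Number of slots that were compiled in. */
int dll_capacity(void)
{
    return MAX_NUM_DLLS;
}

/* Record a DLL name; returns its index, or -1 when the table is full. */
int dll_register(const char *name)
{
    assert(name != NULL);
    if (dll_count >= MAX_NUM_DLLS) {
        fprintf(stderr, "maximum number of DLLs reached\n");
        return -1;
    }
    dll_table[dll_count].name = name;
    return dll_count++;
}
```

Regular builds would behave exactly as they do today, and only a CI build
that sets the flag would pay for the larger table.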

>
>> As a C++ programmer, I am now going to hug my
>> std::vector and quietly retreat.
>>
>
>
> May I humbly request a translation of "std::vector" for people like me who
> are not familiar with C++?
>
>
> I got the following:
>
>
> > install.packages('std')
> Warning in install.packages :
>   package ‘std’ is not available (for R version 3.3.2)
>
>
>       Thanks,
>       Spencer Graves
>
>
>> Dirk
>>
>>   | Martin Maechler
>> |
>> |     > Regards,
>> |
>> |     > Steve Bronder
>> |     > Website: stevebronder.com
>> |     > Phone: 412-719-1282
>> |     > Email: sbronder at stevebronder.com
>> |
>> |
>> |     > On Tue, Dec 20, 2016 at 1:04 AM, Henrik Bengtsson <
>> |     > henrik.bengtsson at gmail.com> wrote:
>> |
>> |     >> One reason for hitting the MAX_NUM_DLLS (= 100) limit is that some
>> |     >> packages don't unload their DLLs when they are being unloaded themselves.
>> |     >> In other words, there may be left-over DLLs just sitting there doing
>> |     >> nothing but occupying space.  You can remove these, using:
>> |     >>
>> |     >> R.utils::gcDLLs()
>> |     >>
>> |     >> Maybe that will help you get through your tests (as long as you're
>> |     >> unloading packages).  gcDLLs() will look at base::getLoadedDLLs() and
>> |     >> its content and compare to loadedNamespaces() and unregister any
>> |     >> "stray" DLLs that remain after corresponding packages have been
>> |     >> unloaded.
>> |     >>
>> |     >> I think it would be useful if R CMD check would also check that DLLs
>> |     >> are unregistered when a package is unloaded
>> |     >> (https://github.com/HenrikBengtsson/Wishlist-for-R/issues/29), but of
>> |     >> course, someone needs to write the code / a patch for this to happen.
>> |     >>
>> |     >> /Henrik
>> |     >>
>> |     >> On Mon, Dec 19, 2016 at 6:01 PM, Steve Bronder
>> |     >> <sbronder at stevebronder.com> wrote:
>> |     >> > This is a request to increase MAX_NUM_DLLS in Rdynload.c from 100 to
>> |     >> > 500.
>> |     >> >
>> |     >> > On line 131 of Rdynload.c, changing
>> |     >> >
>> |     >> > #define MAX_NUM_DLLS 100
>> |     >> >
>> |     >> >  to
>> |     >> >
>> |     >> > #define MAX_NUM_DLLS 500
>> |     >> >
>> |     >> >
>> |     >> > In development of the mlr package, there have been several episodes in the
>> |     >> > past where we have had to break up unit tests because of the "maximum
>> |     >> > number of DLLs reached" error. This error has been an inconvenience that is
>> |     >> > going to keep happening as the package continues to grow. Is there more
>> |     >> > than meets the eye with this error or would everything be okay if the above
>> |     >> > line changes? Would that have a larger effect in other parts of R?
>> |     >> >
>> |     >> > As R grows, we are likely to see more 'meta-packages' such as the
>> |     >> > Hadley-verse, caret, mlr, etc. need an increasing number of DLLs loaded at
>> |     >> > any point in time to conduct effective unit tests. If MAX_NUM_DLLS is set
>> |     >> > to 100 for a very particular reason then I apologize, but if it is possible
>> |     >> > to increase MAX_NUM_DLLS it would at least make the testing at mlr much
>> |     >> > easier.
>> |     >> >
>> |     >> > I understand you are all very busy and thank you for your time.
>> |     >> >
>> |     >> >
>> |     >> > Regards,
>> |     >> >
>> |     >> > Steve Bronder
>> |     >> > Website: stevebronder.com
>> |     >> > Phone: 412-719-1282
>> |     >> > Email: sbronder at stevebronder.com
>> |     >> >
>> |     >> >         [[alternative HTML version deleted]]
>> |     >> >
>> |     >> > ______________________________________________
>> |     >> > R-devel at r-project.org mailing list
>> |     >> > https://stat.ethz.ch/mailman/listinfo/r-devel
>> |     >>
>> |

-
Steve Bronder
