[Rd] Request: Increasing MAX_NUM_DLLS in Rdynload.c
Steve Bronder
sbronder at stevebronder.com
Tue Dec 20 21:08:09 CET 2016
See inline.
On Tue, Dec 20, 2016 at 12:14 PM, Spencer Graves <
spencer.graves at prodsyse.com> wrote:
> Hi, Dirk:
>
>
>
> On 12/20/2016 10:56 AM, Dirk Eddelbuettel wrote:
>
>> On 20 December 2016 at 17:40, Martin Maechler wrote:
>> | >>>>> Steve Bronder <sbronder at stevebronder.com>
>> | >>>>> on Tue, 20 Dec 2016 01:34:31 -0500 writes:
>> |
>> | > Thanks Henrik, this is very helpful! I will try this out on our tests
>> | > and see if gcDLLs() has a positive effect.
>> |
>> | > mlr currently has tests broken down by learner type such as
>> | > classification, regression, forecasting, clustering, etc. There are 83
>> | > classifiers alone, so even when loading and unloading across learner
>> | > types we can still hit the MAX_NUM_DLLS error, meaning we'll have to
>> | > break them down further (or maybe we can be clever with gcDLLs()?).
>> | > I'm CC'ing Lars Kotthoff and Bernd Bischl to make sure I am
>> | > representing the issue well.
>> |
>> | This came up *here* in May 2015
>> | and then May 2016 ... did you not find it when googling?
>
> |
>> | Hint: Use
>> | site:stat.ethz.ch MAX_NUM_DLLS
>> | as search string in Google, so it will basically only search the
>> | R mailing list archives
>>
I did not know this and apologize. I starred this email so I can use it
next time I have a question or request. I did find (and left a comment on)
the Stack Overflow question where you left an answer on this topic:
http://stackoverflow.com/a/37021455/2269255
> |
>> | Here's the start of that thread:
>> |
>> | https://stat.ethz.ch/pipermail/r-devel/2016-May/072637.html
>> |
>> | There was not a clear conclusion back then, notably as
>> | Prof Brian Ripley noted that 100 had already been an increase
>> | and that a large number of loaded DLLs decreases look up speed.
>
> |
>> | OTOH (I think others have noted that) a large number of DLLs
>> | only penalizes those who *do* load many, and we should probably
>> | increase it.
>>
Am I correct in understanding that the decrease in lookup speed only
happens when a large number of DLLs are loaded? If so, this is an expected
cost of having many DLLs, and one that I, and I would guess other
developers, would be willing to pay to have more DLLs available. If
increasing MAX_NUM_DLLS would increase R's fixed memory footprint by a
significant amount, then I think that would be a reasonable argument
against the increase.
> |
>> | Your use case of "hyper packages" which load many others
>> | simultaneously is somewhat convincing to me... in so far as the
>> | general feeling is that memory should be cheap and limits should
>> | not be low.
>>
It should also be pointed out that even in the case of "hyper packages"
like mlr, this is only an issue during unit testing. I wonder if there is
some middle ground here? Would it be difficult to have a compile flag that
sets the value of MAX_NUM_DLLS when compiling R from source? I believe this
would allow us to increase MAX_NUM_DLLS when testing on Travis and Jenkins
while keeping the same footprint for regular users.
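
To make the compile-flag idea concrete, here is a minimal sketch in C. It
is an illustration only: the #ifndef guard and the -DMAX_NUM_DLLS=500
build option are assumptions and do not describe how Rdynload.c is
currently written, and max_num_dlls() is a hypothetical helper added just
so the value can be inspected.

```c
/* Hypothetical guard: keep 100 as the default, but allow the build to
   override it, e.g. with  cc -DMAX_NUM_DLLS=500 ...
   (an assumed flag, not an existing R configure option). */
#ifndef MAX_NUM_DLLS
#define MAX_NUM_DLLS 100
#endif

/* Hypothetical helper: report the limit that was compiled in. */
int max_num_dlls(void)
{
    return MAX_NUM_DLLS;
}
```

Built without the flag, max_num_dlls() reports the default of 100, so
regular users would see no change; a CI build could pass a larger value.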
> |
>> | (In spite of Brian Ripley's good reasons against it, I'd still
>> | aim for a *dynamic*, i.e. automatically increased list here).
>>
>> Yes. Start with 10 or 20, add 10 as needed. Still fast in the 'small N'
>> case and no longer a road block for the 'big N' case required by mlr et
>> al.
>>
This would be nice! My concern, though, is the R-core team's time. This is
the best answer, but I don't feel comfortable requesting it because I can't
help with it and do not want to take up R-core's time without a very
significant reason.

Unit testing for a meta-package is a particular case, though I think an
important one which will impact R over the long term. The options, from
least to most complex, are something like:

1. Do nothing
2. Increase MAX_NUM_DLLS
3. Add a compiler flag for MAX_NUM_DLLS (I actually have no sense of how
   difficult this would be)
4. Change to a dynamically growing list of DLLs

I'm requesting (2) because I think it's a simple short-term answer until
someone has time to sit down and work out (4).
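
For reference, the "start with 10 or 20, add 10 as needed" idea quoted
above could look roughly like the C sketch below. Everything in it is
hypothetical: DllEntry, DllTable and the dll_table_* functions are
illustrative names, not the structures used in Rdynload.c, and the initial
size of 20 and step of 10 are simply the numbers from the suggestion.

```c
#include <stdlib.h>

/* Hypothetical stand-in for R's per-DLL record (the real one has more fields). */
typedef struct {
    const char *name;
} DllEntry;

typedef struct {
    DllEntry *items;    /* heap-allocated table of loaded DLLs */
    size_t count;       /* entries in use */
    size_t capacity;    /* entries allocated */
} DllTable;

enum { DLL_INITIAL_CAPACITY = 20, DLL_GROWTH_STEP = 10 };

/* Start empty; nothing is allocated until the first add. */
void dll_table_init(DllTable *t)
{
    t->items = NULL;
    t->count = 0;
    t->capacity = 0;
}

/* Append an entry, growing the table by a fixed step when it is full.
   Returns 0 on success, -1 if memory could not be allocated. */
int dll_table_add(DllTable *t, const char *name)
{
    if (t->count == t->capacity) {
        size_t new_cap = (t->capacity == 0)
                             ? DLL_INITIAL_CAPACITY
                             : t->capacity + DLL_GROWTH_STEP;
        DllEntry *grown = realloc(t->items, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;  /* the old table is still valid */
        t->items = grown;
        t->capacity = new_cap;
    }
    t->items[t->count].name = name;
    t->count++;
    return 0;
}

/* Release the table and reset it to the empty state. */
void dll_table_free(DllTable *t)
{
    free(t->items);
    dll_table_init(t);
}
```

One caveat with this design: realloc may move the table, so any code that
holds a pointer to an entry across an add would need to be revisited. That
may be part of why a change like (4) takes more work than (2).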
>
>> As a C++ programmer, I am now going to hug my std::vector and quietly
>> retreat.
>>
>
>
> May I humbly request a translation of "std::vector" for people like me who
> are not familiar with C++?
>
>
> I got the following:
>
>
> > install.packages('std')
> Warning in install.packages :
> package ‘std’ is not available (for R version 3.3.2)
>
>
> Thanks,
> Spencer Graves
>
>
>> Dirk
>>
>> | Martin Maechler
>> |
>> | > Regards,
>> |
>> | > Steve Bronder
>> | > Website: stevebronder.com
>> | > Phone: 412-719-1282
>> | > Email: sbronder at stevebronder.com
>> |
>> |
>> | > On Tue, Dec 20, 2016 at 1:04 AM, Henrik Bengtsson <
>> | > henrik.bengtsson at gmail.com> wrote:
>> |
>> | >> One reason for hitting the MAX_NUM_DLLS (= 100) limit is that some
>> | >> packages don't unload their DLLs when they are being unloaded
>> | >> themselves. In other words, there may be left-over DLLs just sitting
>> | >> there doing nothing but occupying space. You can remove these using:
>> | >>
>> | >> R.utils::gcDLLs()
>> | >>
>> | >> Maybe that will help you get through your tests (as long as you're
>> | >> unloading packages). gcDLLs() will look at base::getLoadedDLLs() and
>> | >> its content, compare it to loadedNamespaces(), and unregister any
>> | >> "stray" DLLs that remain after the corresponding packages have been
>> | >> unloaded.
>> | >>
>> | >> I think it would be useful if R CMD check would also check that DLLs
>> | >> are unregistered when a package is unloaded
>> | >> (https://github.com/HenrikBengtsson/Wishlist-for-R/issues/29), but of
>> | >> course, someone needs to write the code / a patch for this to happen.
>> | >>
>> | >> /Henrik
>> | >>
>> | >> On Mon, Dec 19, 2016 at 6:01 PM, Steve Bronder
>> | >> <sbronder at stevebronder.com> wrote:
>> | >> > This is a request to increase MAX_NUM_DLLS in Rdynload.c from 100
>> | >> > to 500.
>> | >> >
>> | >> > On line 131 of Rdynload.c, changing
>> | >> >
>> | >> > #define MAX_NUM_DLLS 100
>> | >> >
>> | >> > to
>> | >> >
>> | >> > #define MAX_NUM_DLLS 500
>> | >> >
>> | >> >
>> | >> > In development of the mlr package, there have been several episodes
>> | >> > in the past where we have had to break up unit tests because of the
>> | >> > "maximum number of DLLs reached" error. This error has been an
>> | >> > inconvenience that is going to keep happening as the package
>> | >> > continues to grow. Is there more than meets the eye with this error,
>> | >> > or would everything be okay if the above line changes? Would that
>> | >> > have a larger effect in other parts of R?
>> | >> >
>> | >> > As R grows, we are likely to see more 'meta-packages' such as the
>> | >> > Hadley-verse, caret, mlr, etc. that need an increasing number of DLLs
>> | >> > loaded at any point in time to conduct effective unit tests. If
>> | >> > MAX_NUM_DLLS is set to 100 for a very particular reason then I
>> | >> > apologize, but if it is possible to increase MAX_NUM_DLLS it would at
>> | >> > least make the testing at mlr much easier.
>> | >> >
>> | >> > I understand you are all very busy and thank you for your time.
>> | >> >
>> | >> >
>> | >> > Regards,
>> | >> >
>> | >> > Steve Bronder
>> | >> > Website: stevebronder.com
>> | >> > Phone: 412-719-1282
>> | >> > Email: sbronder at stevebronder.com
>> | >> >
>> | >> > [[alternative HTML version deleted]]
>> | >> >
>> | >> > ______________________________________________
>> | >> > R-devel at r-project.org mailing list
>> | >> > https://stat.ethz.ch/mailman/listinfo/r-devel
>> | >>
>> |
>
-
Steve Bronder