[R] On-demand importing of a package

Gabor Grothendieck ggrothendieck at gmail.com
Fri Nov 25 17:21:26 CET 2011


2011/11/25 Uwe Ligges <ligges at statistik.tu-dortmund.de>:
>
>
> On 23.11.2011 14:59, Gabor Grothendieck wrote:
>>
>> 2011/11/23 Uwe Ligges<ligges at statistik.tu-dortmund.de>:
>>>
>>>
>>> On 23.11.2011 03:18, Gabor Grothendieck wrote:
>>>>
>>>> On Tue, Nov 22, 2011 at 3:16 PM, Gábor Csárdi<csardi at rmki.kfki.hu>
>>>>  wrote:
>>>>>
>>>>> Dear All,
>>>>>
>>>>> in some functions of my package, I use the Matrix S4 class, as defined
>>>>> in the Matrix package.
>>>>>
>>>>> I don't want to depend on Matrix, however, because my package is
>>>>> perfectly fine without Matrix; most of the functionality does not need
>>>>> Matrix. So Matrix is included in the 'Suggests' line.
>>>>>
>>>>> I load Matrix via require(), from the functions that really need it.
>>>>> This mostly works fine, but I have an issue now that I cannot sort
>>>>> out.
>>>>>
>>>>> If I define a function like this in my package:
>>>>>
>>>>> f<- function() {
>>>>>  require(Matrix)
>>>>>  res<- sparseMatrix(dims=c(5, 5), i=1:5, j=1:5, x=1:5)
>>>>>  y<- rowSums(res)
>>>>>  res / y
>>>>> }
>>>>>
>>>>> then calling it from the R prompt I get
>>>>> Error in rowSums(res) : 'x' must be an array of at least two dimensions
>>>>>
>>>>> which basically means that the rowSums() in the base package is
>>>>> called, not the S4 generic in the Matrix package. Why is that?
>>>>> Is there any way to work around this problem, without depending on
>>>>> Matrix?
>>>>>
>>>>> I am doing this on R 2.14.0, x86_64-apple-darwin9.8.0.
>>>>>
>>>>
>>>> Try adding these three lines to the package:
>>>>
>>>> rowSums<- function(x, na.rm = FALSE, dims = 1L) UseMethod("rowSums")
>>>> rowSums.dgCMatrix<- Matrix:::rowSums
>>>> rowSums.default<- base::rowSums
>>>>
>>>
>>>
>>> Folks, please don't; just import the relevant functionality from the
>>> *recommended* package Matrix.
>>> Messing around even more is certainly less helpful than importing the
>>> relevant part from a namespace/package that you will use anyway.
>>>
>>
>> The real problem is how to deal with conditional dependencies and
>> importing is just as much a kludge as anything else.  In the problem
>> under discussion it has the undesirable property that Matrix is always
>> imported even though it's almost never needed.
>>
>> Additional conditional dependency features may be needed in R.  All
>> the scenarios in which conditional dependencies are involved need to be
>> thought about since there may be interaction among them.
>>
>> Some features might be:
>>
>> - dynamically import another package.
>> - uncouple package installation and loading.  Right now
>> install.packages has a dep= argument that causes the Suggests packages
>> to be installed too.  There should be some way for the package
>> developer to specify this rather than make the user specify it.  For
>> example, if Matrix were not a recommended package and most users
>> wanted to use it in the problem above but a few wanted to use a
>> package that conflicts with it then it would be nice if the package in
>> question could force dep=TRUE without having the user do it.  For
>> example, perhaps there would be an
>>   Installs: Matrix
>
>
> Errr, if I understand this correctly, your arguments are now orthogonal to
> your original comments.
>
> Before you told us it is important to be able to run stuff without having
> Matrix available or just load on demand since it may not be available to the
> users. Now you tell us you want to make it available without having any need
> to use it?
>

I was framing this in terms of the Matrix example, but perhaps it's
easier to understand with the actual example that motivated this for
me.  The desired feature is that whenever sqldf is installed, RSQLite
is installed too, without RSQLite being loaded automatically when
sqldf loads.

Currently the only way to arrange that is to put RSQLite into Suggests
and then instruct the user to use install.packages(..., dep = TRUE),
say.   The problem with that is that it burdens the user with this
installation detail.

sqldf nearly always uses RSQLite, so it should be installed when sqldf
is, without the user having to do anything special.  We don't know at
install time whether RSQLite will be used, but we are willing to
have it installed unnecessarily in order to make things easier for the
majority who do use it.

However, just because RSQLite is installed does not mean that we want
RSQLite to be loaded automatically too.  sqldf can determine whether
the user wants to use the sqlite backend or one of several other
backends and require() RSQLite or not, depending on whether it's
actually going to be used in that session.
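
A minimal sketch of that on-demand loading, for illustration only.  The
helper name load_backend() and the backend-to-package table are
assumptions made up for this example and are not taken from sqldf's
actual code; only require() and its character.only/quietly arguments
are standard R.

```r
# Hypothetical lookup table: backend name -> driver package (illustrative only).
backend_packages <- c(sqlite = "RSQLite", h2 = "RH2", postgresql = "RPostgreSQL")

# Hypothetical helper: attach the driver package for one backend, and only
# at the moment that backend is actually requested.
load_backend <- function(backend, packages = backend_packages) {
  if (!backend %in% names(packages)) {
    stop("unknown backend: ", backend)
  }
  pkg <- packages[[backend]]
  # require() returns FALSE (with a warning) when the package is missing,
  # so the failure can be turned into a clearer error message.
  ok <- suppressWarnings(require(pkg, character.only = TRUE, quietly = TRUE))
  if (!ok) {
    stop("backend '", backend, "' needs package '", pkg,
         "', which is not installed")
  }
  invisible(pkg)
}
```

With this arrangement nothing is loaded at package load time; a call
such as load_backend("sqlite") is what triggers the require(), and it
fails with an explicit message if the driver package is absent.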

Currently, if RSQLite is in Depends then it's always loaded, and if it's
in Suggests then we can't be sure it's been installed, so neither of
these works the way we want.  Installation and loading are tied together
(i.e. coupled), but here we want to separate them.  We always want RSQLite
to be installed without making the user specify it on the
install.packages() call, yet we want the ability to dynamically
require() it rather than have it automatically loaded when sqldf is
loaded.
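
To illustrate that the two things really are separate questions, here
is a small sketch that tests whether a package is installed without
loading it.  The helper name is_installed() is made up for this
example; it relies only on system.file(), which returns an empty
string when the package cannot be found.

```r
# Hypothetical helper: TRUE if `pkg` is installed in the library paths.
# system.file(package = ) only locates the package directory; it does not
# load or attach the package.
is_installed <- function(pkg) {
  nzchar(system.file(package = pkg))
}

# Illustrative use: decide at run time, not at load time.
if (is_installed("RSQLite")) {
  message("RSQLite is installed and can be require()d on demand")
} else {
  message("RSQLite is not installed")
}
```

Under Suggests a package has to perform a check like this itself,
because installation of the suggested package is not guaranteed.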

One way this might be implemented would be to have an Installs: line,
say, in the DESCRIPTION file that lists packages to be installed at
the same time but not automatically loaded.  It would be the same as
Depends, except that Depends also loads the package whereas Installs
does not -- it only installs the dependency, and the package itself has
to require() it if it wants it loaded.

The dynamic part is currently only possible if we use Suggests but
that forces the user rather than the package developer to specify
whether to install it.

(One variation of this is that Installs might only specify that the
dependency is installed by default but the user could still override
it on the install.packages() call by specifying not to install it.)

The point here is that the loading is dynamic but installation always occurs.

The Matrix situation was also one where dynamic action is
important.  It's not identical to the sqldf case, but I mentioned
both because dynamic installation and dynamic loading of packages
belong to the same general category, and any new features in this
area should be considered together in case they interact.

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


