[Rd] One for the wish list - var.default etc

Wed May 9 18:12:35 CEST 2007

Jeffrey J. Hallman wrote:
> Prof Brian Ripley <ripley at stats.ox.ac.uk> writes:
> 
>> On Wed, 9 May 2007, S Ellison wrote:
>>
>>> Brian,
>>>
>>>> If we make functions generic, we rely on package writers implementing 
>>>> the documented semantics (and that is not easy to check).  That was 
>>>> deemed to be too easy to get wrong for var().
>>> Hard to argue with a considered decision, but the alternative facing 
>>> increasing numbers of package developers seems to me to be pretty bad 
>>> too ...
>>>
>>> There are two ways a package developer can currently get a function 
>>> tailored to their own new class. One is to rely on a generic function to 
>>> launch their class-specific instance, and write only the class-specific 
>>> instance. That may indeed be hard to check, though I would be inclined 
>>> to think that is the package developer's problem, not the core team's. 
>>> But it has (as far as I know today ...?) no wider impact.
>> But it does: it gives the method privileged access, in this case to the 
>> stats namespace, even allowing a user to change the default method,
>> which namespaces to a very large extent protect against.
>>
>> If var is not generic, we can be sure that all uses within the stats 
>> namespace and any namespace that imports it are of stats::var.  That is 
>> not something to give up lightly.
> 
> No, but neither is the flexibility afforded by generics. What we have here is
> a false tradeoff between flexibility and the safety of locking stuff down. 

   Yes, that is precisely one of the points, and as some of us recently 
experienced, a reasonably dedicated programmer can override any base 
function through an add-on package. It is, in my opinion, a bad idea to 
become the police here.
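
   To make the namespace point from the quoted text concrete, here is a 
minimal sketch of what ordinary masking does and does not reach. The 
replacement function and the variable names are made up for illustration; 
only var and sd from stats are real.

```r
x <- c(2, 4, 4, 5)

# A top-level definition that masks stats::var on the search path
# (an add-on package exporting its own var would have a similar effect).
var <- function(x, ...) "masked var"

masked    <- var(x)         # finds the masking definition first
qualified <- stats::var(x)  # the namespace-qualified call reaches the original
via_sd    <- sd(x)          # stats::sd resolves var inside the stats namespace

rm(var)                     # remove the masking definition again
```

   Plain masking therefore leaves code inside stats untouched. Getting 
past that takes deliberate effort, but it can be done.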

   AFAIK, Brian's considered decision was his own; I am aware of no 
discussion of that particular point of view about var (and, as noted 
above, it simply doesn't work). It also, AFAICS, confuses what happens 
(implementation) with what should happen, which is easy to do because, 
for most of the methods, whether S3 or S4, there is very little written 
about what should happen.

   That said, there has been some relatively open discussion on one 
solution to this problem, and I am hopeful that we will have something 
in place before the end of July.

   A big problem with S4 generics is who owns them. What seems to be a 
reasonable medium-term solution is to provide a package that lives 
slightly above base in the search path and holds generic functions for 
any base functions that do not have them. Authors of add-on packages 
can then at least share a common generic when that is appropriate. But 
do realize that there are lots of reasons to have generics with the same 
name in different packages that are not compatible, and normal scoping 
rules apply. For example, the XML package has a generic function addNode, 
as does the graph package, and they are not compatible, nor should they 
be. Anyone wanting to use both packages (and I often do) needs to manage 
the name conflicts (and that is where namespaces are essential).
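
   As a rough sketch of what sharing a common generic could look like, 
the following promotes var to an S4 generic and adds a method for one 
class. The class "Temps" and its slot are hypothetical, and this is run 
at top level rather than from the shared package described above, which 
does not exist yet.

```r
library(methods)

# Promote the existing function to an S4 generic; stats::var becomes the
# default method, so ordinary numeric input behaves as before.
setGeneric("var")

# "Temps" is a hypothetical class standing in for an add-on package's class.
setClass("Temps", representation(values = "numeric"))

# A method for the new class; it delegates to stats::var on the raw values.
setMethod("var", "Temps", function(x, y = NULL, na.rm = FALSE, use) {
  stats::var(x@values, na.rm = na.rm)
})

tm <- new("Temps", values = c(1, 2, 3, 4))
var(tm)               # dispatches to the Temps method
var(c(1, 2, 3, 4))    # falls through to the default, stats::var
```

   Where two packages define incompatible generics of the same name, a 
qualified call of the form pkg::fun() selects one of them explicitly.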

best wishes
   Robert

> 
> The tradeoff is false because unit tests are a better way to assure safety.
> If the major packages (like stats) had a suite of tests, a package developer
> could load his own package, run all the unit tests, and see if he broke
> something.  If it turns out that he broke something that wasn't covered by the
> tests, he could create a new test for that and submit it somewhere, perhaps
> on the R Wiki. 
> 

-- 
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org