[R--gR] Proposed design for gR from a machine learning perspective

Martin Maechler maechler at stat.math.ethz.ch
Wed Oct 1 12:05:32 CEST 2003


Hi Kevin,

>>>>> "Kevin" == Kevin Patrick Murphy <murphyk at ai.mit.edu>
>>>>>     on Tue, 30 Sep 2003 13:33:38 -0400 (EDT) writes:

    Kevin> I have written a paper on my proposed design for gR,
    Kevin> based on my experience with developing BNT (the Bayes
    Kevin> Net Toolbox for Matlab). I presented a short version
    Kevin> of this paper at the recent gR workshop.

    Kevin> The paper is available from
    Kevin> http://www.ai.mit.edu/~murphyk/Papers/gr03.pdf

I'm far from being really involved in gR, but have had a very
cursory look at this very interesting paper, particularly the
first few pages. 

One thing, I think more important than you imply, is to really work
with S4 classes -- and to build on the (still evolving!!)
"graph" class (in the package of the same name from
Bioconductor) if possible -- maybe getting involved in the
"graph" class design itself, since that's not really finalized AFAICS.

At the bottom of page 4, you write

 >> The graph class should presumably be compatible with the R graph
 >> package14, although this does not seem to support mixed
 >> directed-undirected graphs. 

Good point!  Do mention it early to the "graph" package
authors --> I have CC'ed Robert Gentleman, the "graph" package
maintainer. 

>> Also, I think it is better to represent graphs using a
>> lightweight structure, such as a C-style struct, 

I understand the temptation, and as a programmer I am still much
more comfortable with it myself, but I remain convinced that it is
wrong here.

>> rather than a heavy class with lots of unrelated methods, which the
>> current R graph class suffers from. 
>> Using a struct rather
>> than a class means that functions (unlike methods) can be
>> added independently by different developers (see discussion
>> in Section 6), and can be grouped/stored according to
>> functionality, rather than forcing them all to be centrally
>> stored in the graph class. (Imagine adding all the functions
>> in Figure 12 to the graph class definition! Why should
>> someone who never uses the junction tree algorithm have to
>> see these functions?)
	
Here you are misled by a conception of OO programming that
differs from the one the S dialects (R and S-PLUS) use.
In your conception (which is close to that of Java, C++, Perl or
Python), all methods belong to a class.

However, in S (incl. S4) OOP, you'd rather think of the methods
as "belonging" to a generic function than to a class {in some way
they belong to both}.  Together with class "extension"/"inheritance"
this leads to the possibility of very modular code development:

E.g., you write methods for new generic functions for which
there are not yet methods in the "graph" package;
or you extend a class and provide a method for the extended
class that differs from the method of the "mother class" (where
all other methods will be inherited from the "mother").
I'm really very far from being an expert on this topic, and have
included *the* expert in this e-mail's CC.
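
To make the two cases above concrete, here is a minimal sketch.
All names in it ("toyGraph", "toyMixedGraph", nEdges, describe) are
hypothetical stand-ins invented for illustration; they are NOT the
classes or generics of the real "graph" package. The point is only
that the new generic and the extended class can live in code that is
entirely separate from the code that defines the original class.

```r
library(methods)

## Hypothetical stand-in for a graph class owned by some other package.
setClass("toyGraph",
         representation(nodes = "character",
                        edges = "matrix"))  # one row per edge: from, to

## Case 1: a NEW generic plus a method for the existing class.
## Nothing in the definition of "toyGraph" has to change for this.
setGeneric("nEdges", function(g) standardGeneric("nEdges"))
setMethod("nEdges", "toyGraph", function(g) nrow(g@edges))

## Case 2: EXTEND the class and override a single method.
setClass("toyMixedGraph",
         representation(directed = "logical"),  # one flag per edge
         contains = "toyGraph")

setGeneric("describe", function(g) standardGeneric("describe"))
setMethod("describe", "toyGraph",      function(g) "undirected graph")
setMethod("describe", "toyMixedGraph", function(g) "mixed graph")

e  <- matrix(c("a", "b",
               "b", "c"), ncol = 2, byrow = TRUE)
g  <- new("toyGraph",      nodes = c("a", "b", "c"), edges = e)
mg <- new("toyMixedGraph", nodes = c("a", "b", "c"), edges = e,
          directed = c(TRUE, FALSE))

nEdges(mg)    # no method for "toyMixedGraph": inherited from "toyGraph"
describe(g)   # method of the "mother class"
describe(mg)  # overriding method of the extended class
```

Someone who never needs nEdges() simply never loads the code that
defines it; the class definition itself stays small.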

On this topic, please carefully read help(Methods) and maybe
even the "green book" (Chambers 1998) mentioned there.
The other reference on the "Methods" help page,
    http://developer.r-project.org/methodsPackage.html
seems not entirely up-to-date to me, and I assume John Chambers
can point you to something even better.

I do hope you will get more feedback on the real "gR" topics in
your paper, which I am not competent to judge.
Thank you for this interesting contribution!

Regards,
Martin Maechler <maechler at stat.math.ethz.ch>	http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum  LEO C16	Leonhardstr. 27
ETH (Federal Inst. Technology)	8092 Zurich	SWITZERLAND
phone: x-41-1-632-3408		fax: ...-1228			<><



