[RsR] covrob --- some OOP-comments

Valentin Todorov valentin.todorov using chello.at
Mon May 22 23:55:49 CEST 2006


I have finally uploaded the new version of rrcov containing the S4 classes
which previously were in the preview package rr4cov. Maybe it is necessary
to write more about it, but for now only briefly:

- unfortunately I still have not removed the classes and data sets that are 
already in robustbase
- I have solved the naming convention and the coexistence of S3 and S4
classes in the following way:
    - a class name starts with a capital letter
    - a function, generic or method name starts with a lowercase letter,
except for functions returning an S4 object, which have the same name as the
corresponding S4 class (aka constructor). Thus we can have in parallel covMest
(a function returning an S3 object "mest") and CovMest (a function) returning
a CovMest (class) object.
    - In the ..\inst\doc directory I have added UML diagrams depicting the
class hierarchy; next time a vignette will also appear.
    - The new S4 classes have their own show/plot/summary methods
    - Later the S3 classes and the corresponding functions will be
"deprecated", similarly to Java: a deprecated function issues a warning
each time it is invoked, saying that it is deprecated, that it will be
removed in one of the following releases, and which function is
recommended instead.
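
A minimal sketch of this naming scheme and of the deprecation warning, in
plain R with only the methods package. The slot names and the estimator body
are assumptions made for illustration (classical mean and covariance stand in
for the real M-estimate); this is not the actual rrcov code.

```r
library(methods)

# Class name starts with a capital letter. The slots are hypothetical;
# the real rrcov class carries more information.
setClass("CovMest", representation(center = "numeric", cov = "matrix"))

# Constructor function with the same name as the S4 class it returns.
# Placeholder estimator: classical mean and covariance, NOT an M-estimate.
CovMest <- function(x) {
    x <- as.matrix(x)
    new("CovMest", center = colMeans(x), cov = stats::cov(x))
}

# The S4 class gets its own show method.
setMethod("show", "CovMest", function(object) {
    cat("Estimate of location:\n")
    print(object@center)
    cat("Estimate of scatter:\n")
    print(object@cov)
    invisible(object)
})

# Lowercase counterpart returning an S3 object "mest", marked as deprecated:
# every call warns and points to the replacement.
covMest <- function(x) {
    .Deprecated("CovMest")
    x <- as.matrix(x)
    structure(list(center = colMeans(x), cov = stats::cov(x)),
              class = "mest")
}
```

Calling covMest(x) still returns the old S3 object, but with a warning that
names CovMest as the replacement, so existing scripts keep running while
users are told what to switch to.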

- Control parameters: in the same way as above, rrcov.control will be 
deprecated. Now we will have a CovControl virtual base class and derived 
classes like CovControlMest, CovControlMcd, etc. each containing the 
necessary parameters. The "estim" parameter is "substituted" by a generic 
'estimate' and methods 'estimate' in each of the derived classes.
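
The control-class design can be sketched roughly as follows. The class names
and the generic 'estimate' are the ones mentioned above; the slots (trace,
alpha, r), their defaults, the argument names of the generic and the method
bodies are assumptions for illustration only. The methods return a classical
estimate as a placeholder instead of calling the real estimators.

```r
library(methods)

# Virtual base class: it cannot be instantiated, only extended.
setClass("CovControl",
         representation = representation("VIRTUAL", trace = "logical"))

# Derived classes, each holding the parameters of one estimator.
# The slots and defaults below are hypothetical.
setClass("CovControlMcd", contains = "CovControl",
         representation = representation(alpha = "numeric"),
         prototype = prototype(trace = FALSE, alpha = 0.5))
setClass("CovControlMest", contains = "CovControl",
         representation = representation(r = "numeric"),
         prototype = prototype(trace = FALSE, r = 0.45))

# The generic replaces the old "estim" parameter: dispatch on the class
# of the control object selects the estimator.
setGeneric("estimate", function(obj, x, ...) standardGeneric("estimate"))

# Placeholder shared by both methods; a real method would call the
# corresponding robust estimator with the parameters stored in obj.
placeholderFit <- function(x, label) {
    x <- as.matrix(x)
    list(method = label, center = colMeans(x), cov = stats::cov(x))
}

setMethod("estimate", "CovControlMcd", function(obj, x, ...) {
    placeholderFit(x, "MCD (placeholder)")
})
setMethod("estimate", "CovControlMest", function(obj, x, ...) {
    placeholderFit(x, "M-estimate (placeholder)")
})
```

With this layout, estimate(new("CovControlMcd", alpha = 0.75), x) picks the
MCD method, and a new estimator only needs a new control class plus its own
'estimate' method.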

- There was an inconsistency in how covMest returned the correlation
as compared to covMcd. Now this parameter and value are removed completely;
if a user wants to use the correlation matrix instead of the covariance,
she/he can call the accessor method getCorr() instead of getCov().
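
A minimal sketch of the two accessors, assuming a base class Cov that stores
only the covariance matrix in a slot named cov (the slot names and the
argument name obj are assumptions). The correlation matrix is then derived on
demand rather than stored.

```r
library(methods)

# Hypothetical, simplified base class.
setClass("Cov", representation(center = "numeric", cov = "matrix"))

setGeneric("getCov", function(obj) standardGeneric("getCov"))
setGeneric("getCorr", function(obj) standardGeneric("getCorr"))

# getCov() returns the stored covariance matrix.
setMethod("getCov", "Cov", function(obj) obj@cov)

# getCorr() rescales the covariance matrix to a correlation matrix,
# so no separate correlation argument or slot is needed.
setMethod("getCorr", "Cov", function(obj) stats::cov2cor(obj@cov))
```

Because the correlation is computed from the covariance, the two can never
disagree, which avoids the kind of inconsistency described above.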

best regards,
Valentin



----- Original Message ----- 
From: "Martin Maechler" <maechler using stat.math.ethz.ch>
To: "Valentin Todorov" <valentin.todorov using chello.at>
Cc: <R-SIG-Robust using stat.math.ethz.ch>
Sent: Monday, March 27, 2006 2:41 PM
Subject: Re: [RsR] covrob --- some OOP-comments


>>>>>> "ValenT" == Valentin Todorov <valentin.todorov using chello.at>
>>>>>>     on Sat, 25 Mar 2006 11:18:49 +0100 writes:
>
>    ValenT> ----- Original Message ----- 
>    ValenT> From: "Martin Maechler" <maechler using stat.math.ethz.ch>
>    ValenT> ...
>
>    >> I assume that at least the base class(es) ('Cov', 'CovR' in
>    >> Valentin's naming scheme) should be put into 'robustbase'  ASAP,
>    >> so other packages can import them from the robustbase namespace
>    >> and extend (aka "inherit from") them.
>
>    ValenT> I have uploaded a new version of rrcov with
>    ValenT> constrained M-estimates of location and scatter
>    ValenT> covMest(), for test purposes, which still returns an
>    ValenT> S3 class. Now I am implementing Cov, CovR and the
>    ValenT> derived from them Mest which I want to upload in the
>    ValenT> next days. As soon as this construction is stable,
>    ValenT> I'll move all three of them to robustbase (Martin, I
>    ValenT> rely on your support).
>
> sure.
> As I alluded earlier, I'd also be happy for you to get direct
> write access to the subversion repository, i.e. to the database engine
> which is behind https://svn.R-project.org/R-packages/trunk/robustbase
>
>    ValenT> After that I'll port covMcd to return an S4 class
>    ValenT> derived from CovR.
>
>    ValenT> Two questions arise:
>
>    ValenT> - is there some "standard" for naming classes. I
>    ValenT> assume the usual starting with a capital letter? In
>    ValenT> some cases one can go further and select a
>    ValenT> particular capital letter. For example in Visual
>    ValenT> C++/MFC every class started with a capital C (in our
>    ValenT> case we would have Ccov, CcovR, Cmcd, Cogk, Cmest).
>
> There are "some" standards, but not endorsed officially;
> particularly there is no capitalization or "prefix"
> standard.  In a 'function based' OO system like S4, the classes
> are a bit less visible than in a 'class based' system like C++/Java.
>
> Several of the more classical R packages that have been using S4
> use simple all-lowercase-alphabet class names such as 'mle' or
> 'pixmap' but also 'sparseMatrix'.  The one ``rule'' that I think
> "everyone" agrees on is  that the creator function, particularly of
> a ``principal'' class, should have the identical
> name as the class it creates. E.g. mle() returns S4 objects of
> class 'mle', Matrix() returns objects inheriting from class
> "Matrix", etc.  However even this rule is sometimes not
> practical for diverse reasons, typically name clashes with
> already existing functionality in R (in possibly other packages, etc).
> One of the main reasons that IMO it doesn't make sense trying to
> impose such standards is the fact that S (and hence R) has a
> history of more than 20 years, and one has wanted to stay back
> compatible as much as possible when providing new facilities.
>
> If we try to adhere to the only "agreed upon" standard above, our
> class would need to be called  "covrob";  its super class (which
> conceptually also contains the classical non-robustly estimated
> covariance structures) could well be called "cov", even though
> the standard cov() function does not return classed objects.
> Further thinking about this directly leads to Valentin's 2nd
> question:
>
>    ValenT> - What happens with the user of, for example
>    ValenT> covMcd() when it begins to return an S4 class "Mcd"
>    ValenT> instead of the current S3 "mcd".
>
>    ValenT> Of course those that just use print/plot/summary
>    ValenT> will not notice the change, but what about those
>    ValenT> that use the returned object within their programs?
>
> Very good point that has also come to my mind when contemplating
> your proposed inheritance / class hierarchy:
>
> All the user's code / scripts / functions that rely on the
> current structure of, say,  covMcd(),  will stop working
> correctly.
>
>    ValenT> This is actually a general question on compatibility.
>
> Indeed!
>
> One approach that I usually favor is to require new function
> names for getting the new-class results *and*
> keep the old functions returning a back-compatible result; in
> the present case keep covMcd() or covOGK() returning the lists
> (and maybe S3 class) they currently return.
>
> That would be one argument pro only having covrob() return an S4
> class and all the underlying "method functions" return lists
> (possibly with an S3 class) -- just about what Peter and Heinz
> have been proposing.  Additionally one could have newly named
> functions {say 'covMCD'} that call the same underlying "method
> functions" as covrob, e.g. call 'covMcd(..)', and only covMCD
> would return an S4 class "covMCD" which extends (or "inherits
> from") "covrob".
>
> The completely alternative approach would be to declare that all
> current users of 'rrcov' (or 'robustbase') should change their
> scripts and functions whenever they start using the
> new_generation-version of "robustbase" and start using the slots
> (or accessor functions where we provide them) of the new
> S4-classed return values.
> This 2nd ``brutal'' approach (of non-compatible upgrade) is
> possible in situations where not too many users of the current
> package exist, and they basically agree to do the extra work of
> upgrading their R scripts.
> Personally, I'm  *very*  reluctant to break
> compatibility, though I agree it has to happen sometimes in order
> to not hinder progress.
>
> Martin



