[RsR] covrob --- some OOP-comments

Mon Mar 27 20:50:09 CEST 2006

Hi Heinrich,

I thought you might be interested in taking a look at the cov and  
covRob functions in the (Insightful*) Robust Library.  I am in the  
process of updating these so that they work under both R and S-Plus.   
You can find a snapshot of the project here.

   www.stats.ox.ac.uk/~konis/robust/robust.tar.gz

It's not in a very user friendly form yet but it should work on 32-bit
Linux (that's what I have been using anyways) if you do the following.

% tar xzf robust.tar.gz
% cd robust/src
% make -f Makefile
% cd ..
% R
<...>
 > library(lattice)
 > source("build.q")
 > runif(1)
 > # make a covRob object
 > rob <- covRob(woodmod.dat)
 > # make a cov object
 > cls <- cov(woodmod.dat)
 > # try the print, summary and plot methods
 > # fit both classical and robust at the same time
 > fm <- fit.models(list(Robust = "covRob", Classical = "cov"),
 +                  data = woodmod.dat)
 > # try the print, summary and plot methods on these

You may also want to take a look at how the control parameters can be  
passed to covRob.  It uses a method quite similar to the one you  
describe.

Hope this is helpful.

Kjell

*Note that this code has not been released under the *GPL yet.  It  
should be soon but they've been telling me that for a while now.  BTW  
the license is in license.txt and claims to be an open source license  
but I don't actually want to read it.

On 26 Mar 2006, at 22:34, Heinrich Fritz wrote:

> Dear Peter, Valentin and Martin,
>
> Thanks for your comments on the package covrob!!
>
> Your suggestions are very constructive, and we have already updated the
> implementation according to your remarks. The new version is now online
> at
>
> http://www.statistik.tuwien.ac.at/rsr/groups/mva.html
>
> with the following changes:
>
> (*) The class has been renamed (from covStruct) to cov
>
> 	Further there is a covR - class available (derived from cov). At the
> moment it does not include any other slots.
>
> (*) S3 - S4 mix:
> This problem has been resolved.
>
> The following functions are still available for the (S4) cov object:
>
> plot <taking one or two objects>
> print
> summary
>
> (*) I have improved the argument - passing.
>
> The reason why I chose to use the ... implementation was the following:
>
> Some cov estimation methods (e.g. covOGK) do not take a "control"
> argument for passing input arguments. The covrob wrapper should work
> with as many existing cov methods as possible (without changing those
> methods!!!), and in this case passing arguments is only possible via
> the ... syntax. However, if a cov method (like covMest or covMcd) takes
> a control argument for passing other arguments, this control structure
> is passed on anyway.
>
> I have now explicitly added the control - argument to covrob.
>
> My suggestion for passing arguments via a control structure is to use
> a simple list. In my opinion it is not really comfortable for users to
> instantiate a new class for (almost) every single call of covrob.
>
> Further, the only disadvantage of using a simple list is that, since no
> slots (and slot data types) are defined, the user could pass anything
> via the control structure. Invalid input has to be caught anyway,
> because it is always possible to call a function with invalid arguments
> (e.g. cov(testdata, alpha = "asd")).
>
> So the now implemented solution would work the following way:
>
> covrob(testdata, method = "<estimator>",
>        control = list(argument1 = .., argument2 = ..))
>
> or this would work too:
>
> covrob(testdata, method = "<estimator>", argument1 = .., argument2 = ..)
>
> The problem with the second version is that there may be naming
> clashes (as mentioned by XX), e.g. when a cov estimator itself takes an
> argument called "method". This form exists only to ensure that covrob
> works with existing cov estimators which do not take a control
> structure (e.g. because the author does not know anything about the
> covrob package).
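
A minimal sketch of how the two calling styles above could be served by one wrapper. This is an assumption for illustration, not the covrob implementation: call_estimator() and toy_cov() are hypothetical names, and letting arguments in ... override the control list is a choice made for this sketch only.

```r
# Hypothetical helper, not part of the covrob package.
call_estimator <- function(x, estimator, ..., control = list()) {
  # Arguments given directly in ... take precedence over the control list.
  args <- modifyList(control, list(...))
  accepted <- names(formals(estimator))
  # Drop entries the estimator cannot accept, unless it has its own '...'.
  if (!("..." %in% accepted)) {
    args <- args[names(args) %in% accepted]
  }
  do.call(estimator, c(list(x), args))
}

# Toy estimator standing in for an existing covariance method.
toy_cov <- function(x, scale = 1) scale * stats::cov(x)

x <- as.matrix(iris[, 1:4])
# Style 1: arguments in a control list (the unknown entry is dropped).
via_control <- call_estimator(x, toy_cov,
                              control = list(scale = 2, unused = TRUE))
# Style 2: the same argument passed through '...'.
via_dots <- call_estimator(x, toy_cov, scale = 2)
```

Because the wrapper's own arguments are matched first, an estimator argument that shares a name with one of them can only be reached through the control list, which is the naming problem described above.
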
>
> (*) Further I have implemented a function for applying the control -
> object to the input arguments..
>
> function (x, a1 = <default 4 a1>, a2 = <default 4 a2>,
>           a3 = <default 4 a3>, <further arguments>, control)
> {
> 	if (!missing (control))
> 	{
> 		if (!is.null(control$a1))		a1 = control$a1
> 		if (!is.null(control$a2))		a2 = control$a2
> 		if (!is.null(control$a3))		a3 = control$a3
> 		# other arguments..
> 	}
> 	....
> }
>
> would then be
>
> function (x, a1 = <default 4 a1>, a2 = <default 4 a2>,
>           a3 = <default 4 a3>, <further arguments>, control)
> {
> 	if (!missing (control))
> 		ParseControlStructure (control)
> 	....
> }
>
> This was possible by the (very excellent) code from Peter
>
>>> eval.parent(substitute( object@distance <- distanceValues ))
>
> I don't know if it is the common way of passing arguments by reference,
> but this is very powerful!
>
> The advantage of this approach is that the control structure does not
> need to contain all arguments which can be passed to the function, but
> only those which should really be set.
>
> An example of the function ParseControlStructure can be found in the
> corresponding help file.
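
A minimal sketch of what a function like ParseControlStructure might do. The package's own version is not shown in this message, so this is an assumed re-implementation. It uses assign() into parent.frame() rather than the eval.parent(substitute(...)) idiom quoted above, and the function estimate() is a hypothetical caller.

```r
# Assumed re-implementation for illustration; the real function may differ.
ParseControlStructure <- function(control, envir = parent.frame()) {
  for (name in names(control)) {
    # Only overwrite variables that already exist in the calling function,
    # so unknown entries in the control list are silently ignored.
    if (nzchar(name) && exists(name, envir = envir, inherits = FALSE)) {
      assign(name, control[[name]], envir = envir)
    }
  }
  invisible(NULL)
}

# Hypothetical estimator with two tuning arguments and a control list.
estimate <- function(x, a1 = 1, a2 = 2, control) {
  if (!missing(control)) ParseControlStructure(control)
  c(a1 = a1, a2 = a2)
}

# Only a2 is set in the control list; a1 keeps its default.
res <- estimate(NULL, control = list(a2 = 20, unknown = "ignored"))
```
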
>
> (*) Accessor functions
>
> I have implemented several accessor functions:
>
> cov
> cor       returns (on first call: calculates) the correlation matrix
> center
> method    returns the name of the estimator
> details   returns the whole output object of the cov estimation method
> datadim
> mah       Mahalanobis distances
> mah.wt    only calculates the mah weights (as Martin Maechler proposed)
>
> These are generic functions - I hope this is the way it was intended!
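
A minimal sketch of accessors written as S4 generics. The class CovSketch and the generic corMatrix are hypothetical names chosen so that stats::cov and stats::cor are not masked; the package's actual class and accessor definitions are not shown in this message. Unlike the cor accessor described above, this version recomputes the correlation matrix on every call instead of storing it.

```r
library(methods)

# Hypothetical class; slot and generic names are illustrative only.
setClass("CovSketch",
         representation(cov = "matrix", center = "numeric",
                        method = "character"))

# Accessor generic and method for the location estimate.
setGeneric("center", function(object, ...) standardGeneric("center"))
setMethod("center", "CovSketch", function(object, ...) object@center)

# Accessor that derives the correlation matrix from the stored covariance.
setGeneric("corMatrix", function(object, ...) standardGeneric("corMatrix"))
setMethod("corMatrix", "CovSketch",
          function(object, ...) stats::cov2cor(object@cov))

x <- as.matrix(iris[, 1:4])
obj <- new("CovSketch", cov = stats::cov(x), center = colMeans(x),
           method = "classic")
```
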
>
> (*) Mahalanobis distances.
>
> They are now taken from the output object of the cov method (if
> available).
>
> (*)
>
> * from Martin Maechler: *
>>
>> 1. In general I think we should have a bit less "optional"
>>   parts; in particular, top of page 4, I think one could
>>   require 'method'.
>
> Again: We wanted to be compatible with as many (existing) covariance
> estimation methods as possible. I don't believe that it is very
> expensive to check whether a method-string has been returned or not.
>
> (*)
>
>> '3.1  arguments of covrob()' :
>>   x: should be a numeric matrix  *or a data frame*
>
> This has been implemented anyway. It was not described correctly in
> the pdf.
>
> (*)
>
> * from Valentin Todorov *
>
>> - What happens with the user of, for example, covMcd() when it begins
>> to return an S4 class "Mcd" instead of the current S3 "mcd"? Of course
>> those that just use print/plot/summary will not notice the change, but
>> what about those that use the returned object within their programs?
>> This is actually a general question on compatibility.
>
> This was exactly the reason why we chose the wrapping solution. It
> should work (and generate S4 output) using available functions, because
> we assumed that it is impossible to change the output of functions
> which are already published and in use by others. Code relying on
> these functions to produce the output described in the help pages
> would not work anymore. Everybody using these functions would have to
> change their code. So I think the wrapping version is the solution with
> the fewest compatibility problems.
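
A minimal sketch of the wrapping idea: an existing estimator is called unchanged and its result is copied into an S4 object, with the raw output kept for code that relies on the old structure. CovWrapped and wrap_estimator() are hypothetical names, and stats::cov.wt is used here only as a stand-in for an existing estimator that returns a list.

```r
library(methods)

# Hypothetical S4 container; 'details' keeps the estimator's raw output.
setClass("CovWrapped",
         representation(cov = "matrix", center = "numeric",
                        method = "character", details = "list"))

wrap_estimator <- function(x, estimator, label, ...) {
  out <- estimator(x, ...)  # the existing function is called unchanged
  new("CovWrapped", cov = out$cov, center = out$center,
      method = label, details = unclass(out))
}

x <- as.matrix(iris[, 1:4])
w <- wrap_estimator(x, stats::cov.wt, label = "cov.wt")

# Old-style access still works through the stored raw output.
n_obs <- w@details$n.obs
```
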
>
> (*) I've removed the built-in classical estimation.
> Instead I've implemented an estimator "cov.classic".
>
> best regards,
>
> Heinrich
>
> _______________________________________________
> R-SIG-Robust using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-robust