[RsR] Package name vote --> 'robustbase'

Tue Dec 20 14:38:04 CET 2005

As most of you have probably seen in the meantime,
the name 'robustbase' won by a large margin:

 robustbase 45
 robustats   9
 robusta     5
 robustat    1

Thanks to all 20 voters!  As a politically active Swiss, I'm
used to voting differently than the majority ;-)

I plan to make the `always current' state of the (source)
package available (by "svn" or "subversion", but also https) at
the same URL where other R packages already are;  I will announce
it here when it's ready. It will probably also make sense to submit
it to CRAN very early, even while unfinished, simply so that
Windows users who can't build R packages from source can
easily install the package.

Now we can get to work on it, i.e., put functionality into it.
We might want to seriously consider the private package that
Andreas Ruckstuhl posted (on Dec 7) and his question

ARu> Talking about names: how should we call functions which do
ARu> i.e. robust fitting of a glm: 
ARu>
ARu> 	rfglm  (Robust Fitting of GLM)
ARu> 	rglm
ARu> 	robglm

DATA SETS
---------
Of course, since this package is somewhat focused on the
Maronna-Martin-Yohai book, we should eventually get their
datasets in there, and Ricardo and Victor agreed in Treviso to
provide them eventually.

OTOH, it's quite useful to have data sets available from the
beginning in order to write examples and tests using those data.
I've already asked some individuals about this, but I now ask here
in public for useful / sensible / well-known and not too large data
sets to also become part of the package, such that the examples (on
each help page!) can make use of those data sets.
I already have what I call 'Animals2', which is another version
of the "brain vs body weight" data, namely the union of the two MASS
data sets 'Animals' and 'mammals'.
Of course, Rousseeuw & Leroy (1990) contains a few dozen more
data sets, some of which would be interesting.
Several of them are currently already in Valentin's 'rrcov'
package, and -- if Valentin agrees -- I would propose to just
"mirror them" in the new 'robustbase' package.  Eventually, they
could be removed from 'rrcov', at the latest once rrcov
"Depends" on 'robustbase' (i.e. loads or attaches robustbase
when rrcov itself is loaded).

Note BTW that the stackloss data *is* already in the core
package 'datasets' (and I don't understand why at least three
other CRAN packages have *also* provided the stackloss data,
just with slightly different variable names ...; well, one of
them is package 'MPV', which provides all data sets from the book
'Montgomery, Peck & Vining').
Also, we really don't need data sets that are already in
"standard" or "recommended" R packages; i.e., notably we could
well make use of all those you see from
     data(package = "datasets")  # standard
     data(package = "MASS")      # recommended

You can always use data sets from other packages by
    data(<name>, package = "<packagename>")
e.g.  data(mammals, package = "MASS")
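
To illustrate the data(<name>, package = ...) idiom on the 'Animals2'
case mentioned above, here is a minimal sketch of how a union of the
two MASS data sets could be formed.  It is an assumption, not the
construction actually used for 'Animals2': the object name
'animals.union' is made up, and matching the row names
case-insensitively is just one plausible way to avoid duplicates.

```r
## Sketch only: one possible union of MASS's 'Animals' and 'mammals';
## the real 'Animals2' may have been put together differently.
data(Animals, package = "MASS")
data(mammals, package = "MASS")

## The two data frames must have the same columns for rbind() to work.
stopifnot(identical(names(Animals), names(mammals)))

## Keep all of 'Animals' and add those rows of 'mammals' whose row name
## does not already occur (compared case-insensitively; an assumption).
new <- !(tolower(rownames(mammals)) %in% tolower(rownames(Animals)))
animals.union <- rbind(Animals, mammals[new, ])   # hypothetical name
str(animals.union)
```

Animals that appear in both data sets under different spellings would
still end up twice with this approach, so the result would need to be
checked by eye.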

Further note:  Apart from univariate data, unless there are very
  good reasons, we'd only want data frames, not matrices or
  single vectors, since the latter can always easily be
  extracted from the data frames.
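
As a small illustration of that last point, the following sketch
pulls a response vector and predictor matrices out of the 'stackloss'
data frame from the standard 'datasets' package.  The object names
y, X and X1 are only chosen for this example.

```r
## 'stackloss' is a data frame in the standard package 'datasets'.
data(stackloss, package = "datasets")

## A single vector: the response column.
y <- stackloss$stack.loss

## A numeric matrix of the remaining (predictor) columns.
X <- as.matrix(stackloss[, names(stackloss) != "stack.loss"])

## A design matrix including the intercept column, via a formula.
X1 <- model.matrix(stack.loss ~ ., data = stackloss)
```

Going the other way, from a bare matrix back to a data frame with
sensible variable names, usually needs extra information, which is
one reason to ship the data frame.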

Now if you'd consider "donating" data sets to the 'robustbase'
package, please send me two files,
1) typically a table file (*.tab, *.txt or *.csv), or a binary *.rda
   file; see the manual "Writing R Extensions", section 'Package
   subdirectories'
2) a *.Rd file as produced by  prompt(<your_dataframe>)
   and edited __by you__, with the relevant information about
   the data filled in.
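
The two files could be produced roughly as follows.  This is only a
sketch: 'mydata' and its contents are made-up placeholders, and the
files are written to a temporary directory here rather than to
wherever you would actually keep them.

```r
## 'mydata' is a made-up placeholder for the data frame to be donated.
mydata <- data.frame(x = c(1.2, 3.4, 5.6), y = c(2.1, 3.9, 40))

outdir <- tempdir()  # temporary directory, just for this sketch

## 1) a plain-text table ...
tabfile <- file.path(outdir, "mydata.tab")
write.table(mydata, file = tabfile)
## ... or, alternatively, a binary file
rdafile <- file.path(outdir, "mydata.rda")
save(mydata, file = rdafile)

## 2) a skeleton help file, still to be edited by hand afterwards
rdfile <- file.path(outdir, "mydata.Rd")
prompt(mydata, filename = rdfile)

## Read the table back to check that it can be parsed again.
mydata2 <- read.table(tabfile, header = TRUE)
```

The *.Rd skeleton written by prompt() contains only placeholders for
the description, format and source; those are the parts that need to
be filled in before sending.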

Martin Maechler.

>>>>> "MM" == Martin Maechler <maechler using stat.math.ethz.ch>
>>>>>     on Thu, 15 Dec 2005 17:52:46 +0100 writes:

    MM> I didn't get strong statements for yet another name, so the vote
    MM> "is it" ...  Well, it seems pretty clear at the moment, but
    MM> then, many more people could still vote...

    MM> A simple R script (and a cron job) auto-produces the
    MM> webpage
    MM> http://stat.ethz.ch/~maechler/R-sig-robust-vote.html
    MM> from the data file {that I update manually}.

    MM> I know from politics that it is strictly forbidden to publish
    MM> polls when the vote is still happening {because of the
    MM> well-known "the winner takes it all" effect},
    MM> but then we want to have some fun, too...  ;-)