[RsR] estimators based on random samples... - should be random
Matias Salibian-Barrera
matias at stat.ubc.ca
Mon May 1 19:19:25 CEST 2006
Hello,
Thanks Martin for (once again!) taking the lead in sparking a
discussion. My comments are inserted below.
> In R, we have always adhered to the convention, that such
> estimators should use R's random number generators (=: RNGs) and
> hence their result will be a function of the initial random seed --
> .Random.seed in S and R, typically set via set.seed().
A good convention, IMHO.
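To make the convention concrete, here is a minimal sketch in base R. The
function toy_resample_location() is a hypothetical toy estimator written
only for this illustration (it is not part of robustbase); the point is
just that anything drawing its subsamples via sample() inherits its
reproducibility from set.seed().

```r
# Hypothetical toy estimator (an assumption for illustration, not robustbase code):
# draw random subsamples with R's RNG and keep the subsample mean that
# minimises the median absolute residual over the full data.
toy_resample_location <- function(x, n_sub = 50, size = 3) {
  best <- NA_real_
  best_crit <- Inf
  for (i in seq_len(n_sub)) {
    cand <- mean(sample(x, size))      # sample() uses R's RNG, hence .Random.seed
    crit <- median(abs(x - cand))
    if (crit < best_crit) {
      best_crit <- crit
      best <- cand
    }
  }
  best
}

x <- c(1.2, 0.8, 1.1, 0.9, 1.0, 9.5, 10.2)

set.seed(42); a <- toy_resample_location(x)
set.seed(42); b <- toy_resample_location(x)
identical(a, b)   # TRUE: same seed, same subsamples, same estimate
```

Without the set.seed() calls, the two fits would start from different RNG
states and would not be guaranteed to agree.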
> The current algorithm implementations in 'robustbase' however do
> not adhere to the convention, but rather use an own (cheap) RNG
> [covMcd(), ltsReg()] or the RNG provided by the operating system
> C library rand() function [lmrob()] --- and in all these cases,
> always use the same random seed, by default.
I believe this (each algorithm using its own or the operating system's
RNG) is merely due to the "atomized" nature of the development of the
separate pieces of code that are now in robustbase, and does not reflect
an "a priori" design criterion.
> Of course, this has the advantage that all your students get the
> same estimates for the same data (well, at least on the same
> computer hardware and software combination), but I think we
> should switch to using R's RNGs and have all these results
> properly depend on the current random seed, i.e. typically only
> give the same results after the set.seed(<n>) call.
Probably the most noticeable effect of this change would be that in some
cases consecutive calls to fit the same model on the same data may yield
different results, and high levels of anxiety among "uninitiated" users
will surely follow...
I guess if the convergence criteria of these algorithms are sufficiently
tight then this will typically happen only in those cases where the
existence of two (or more) solutions is actually informative (and
probably relevant for the analysis). Maybe somebody has had other
experiences?
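A rough sketch of how one might look for such multiple solutions, again in
base R. Both toy_lts_location() and the two-cluster data are assumptions
made up for this illustration, and the very small number of candidates is
chosen deliberately so that the result can change from seed to seed.

```r
# Hypothetical toy estimator (illustration only, not robustbase code):
# pick a few random candidate locations from the data and keep the one
# with the smallest sum of the h smallest squared residuals (LTS-like).
toy_lts_location <- function(x, n_cand = 3, h = length(x) %/% 2) {
  cands <- sample(x, n_cand)                      # random candidates via R's RNG
  crit <- sapply(cands, function(m) sum(sort((x - m)^2)[seq_len(h)]))
  cands[which.min(crit)]
}

# Made-up data with two clusters of equal size, so that a fit near either
# cluster is a plausible "solution".
x <- c(0.9, 1.0, 1.1, 1.2, 9.8, 9.9, 10.0, 10.1)

# Refit under 20 different seeds and tabulate which solutions turn up.
fits <- sapply(1:20, function(s) {
  set.seed(s)
  toy_lts_location(x)
})
table(fits)
```

If the table shows fits in both clusters, that is a hint that the data
support more than one solution, which is worth reporting rather than hiding
behind a fixed seed.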
I second Martin's suggestion, but add that we accompany this change with
good examples (one for each model?) in the documentation illustrating
how different solutions can yield more insight into the analysis.
Matias
--
______________________________________________________________
Matias Salibian-Barrera - Department of Statistics
University of British Columbia - matias using stat.ubc.ca
Phone: (604) 822-3410 - Fax: (604) 822-6960