[RsR] Distribution of robust distances

Valentin Todorov valentin.todorov at chello.at
Tue Aug 28 23:52:34 CEST 2007


Dear Harry,

Thank you very much for this question, which raises an important issue that 
comes up now and then in different forms but almost always amounts to "Why do 
the different MCD implementations give different results?". And the answer 
is almost always "Because of the different consistency and small-sample 
correction factors used".

There are several problems in your case (and Kevin Wright's):

1. I assume you are using cov.rob() or cov.mcd() from MASS for computing the 
MCD estimator (as seen in Kevin's code). These functions return the 
reweighted MCD covariance matrix, while the results in the paper of Hardin 
and Rocke are for the raw MCD. Jo Hardin's web page has an MCD program, a 
straightforward implementation of FAST-MCD in R without partitioning, 
nesting, reweighting or correction factors, which they used for the 
computations. With this program you could reproduce the results, but 
unfortunately it is very slow compared to the implementations in native 
Fortran or C code (like those in MASS and rrcov/robustbase). Here is the 
link:

http://pages.pomona.edu/~jsh04747/Research/mcd.est.r

2. If you use covMcd{robustbase} or CovMcd{rrcov} instead, you can: (i) 
switch off the correction factors and (ii) take the raw estimates.

3. You do not need to estimate c (page 19) since it is already applied in 
covMcd() - the covariance matrix was divided by c, i.e. you are multiplying 
the distances by this factor twice.
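
To see why applying a factor to the covariance matrix a second time changes 
the distances, here is a small check in base R. It is only a sketch: the 
data, the classical mean/covariance stand-ins and the value c0 = 0.8 are 
assumptions for illustration, not the constant from the paper.

```r
# Illustrative data only; any numeric matrix would do.
set.seed(42)
x <- matrix(rnorm(50 * 2), ncol = 2)

ctr <- colMeans(x)   # stand-in location estimate (not the MCD)
S   <- cov(x)        # stand-in scatter estimate (not the MCD)
c0  <- 0.8           # arbitrary illustrative factor (assumption)

d2        <- mahalanobis(x, center = ctr, cov = S)        # squared distances
d2_scaled <- mahalanobis(x, center = ctr, cov = S / c0)   # scatter divided by c0

# Dividing the scatter matrix by c0 multiplies every squared distance by c0
scaling_ok <- isTRUE(all.equal(d2_scaled, c0 * d2))
print(scaling_ok)
```

So if the factor is already built into the covariance matrix and is then 
applied once more, the squared distances are rescaled twice.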

In summary: using the raw estimates from covMcd() called with 
use.correction=FALSE and setting c=1 in the code of Kevin will reproduce the 
results.



covResult <- covMcd(x, use.correction=FALSE)
T <- covResult$raw.center
C <- covResult$raw.cov

c <- 1
.
m <- .
.
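
For completeness, a self-contained sketch of the first part of this recipe, 
assuming the robustbase package is installed. The simulated data (n = 200, 
p = 3) are hypothetical, and the remaining steps of Kevin's code (m and the 
quantile computation) are not reproduced here.

```r
library(robustbase)  # provides covMcd()

# Hypothetical sample for illustration only
set.seed(1)
x <- matrix(rnorm(200 * 3), ncol = 3)

# MCD without the small-sample correction factor
covResult <- covMcd(x, use.correction = FALSE)

# Raw (not reweighted) MCD location and scatter
T <- covResult$raw.center
C <- covResult$raw.cov

# Squared robust distances based on the raw estimates
d2 <- mahalanobis(x, center = T, cov = C)
summary(d2)
```

These d2 values are the distances to compare against the quantiles, with no 
further scaling by c.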

I'll try to find the code of the simulations I did some time ago and will 
post it in the next few days.

Hope this helps,
Best regards,
Valentin

----- Original Message ----- 
From: "Southworth, Harry" <Harry.Southworth using astrazeneca.com>
To: <R-SIG-robust using stat.math.ethz.ch>
Sent: Tuesday, August 28, 2007 4:21 PM
Subject: [RsR] Distribution of robust distances


> Hello.
>
> Has anyone implemented anything to compute quantiles of the distribution
> of robust distances following Hardin & Rocke
> (http://dmrocke.ucdavis.edu/papers/HardinRocke2005.pdf)?
>
> I've got a function to do it, but I can't reproduce the results of
> Hardin & Rocke because my function is returning values that are too high.
> Searching the R help archive, I found a message from 2004 describing the
> same problem (http://tolstoy.newcastle.edu.au/R/help/04/05/1296.html).
> The code in that message is essentially similar to mine (except that he
> uses a 1 - h/n that I think should be h/n).
>
> I'd be grateful for any pointers.
>
> Thanks,
> Harry
>
> _______________________________________________
> R-SIG-Robust using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-robust
>



