[RsR] robust cov for n/2 < p < n ~ 1000?

Martin Maechler
Thu Feb 1 19:22:28 CET 2007


When reading the paper on LIBRA (2004),
I stumbled (once more) over the fact that there seems to be
missing functionality for the "in-between case" between
"low" and "high" dimensional data.

For "low dimensional", we have MCD (and similar algorithms)
requiring p < n/2, and the authors recommend even p < n/5 for
covMCD(alpha = 1/2).
For "high dimensional", p > n, the use of robust PCA (and
extensions) is recommended.

For a situation with p = 0.70 n (and think of n = 1000),
we currently have covOGK() {in different versions, using
different 1d-scale functions etc}, but that's quite slow for the
situation above.  What do people do here?
Is it just a matter of writing an implementation of covOGK() which
is optimized for speed, e.g. by computing in C instead of R code?
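
To make the cost concrete, here is a minimal sketch of only the pairwise
Gnanadesikan-Kettenring step that OGK builds on; the orthogonalization and
reweighting steps are left out, so this is not what covOGK() actually does.
The function names (mad, gk_cov, pairwise_gk) and the call counter are
hypothetical, and the MAD is just one possible 1d-scale function. It is
plain Python using only the standard library.

```python
from itertools import combinations
from statistics import median


def mad(x):
    """Median absolute deviation, scaled by 1.4826 (consistency at the normal)."""
    m = median(x)
    return 1.4826 * median([abs(v - m) for v in x])


def gk_cov(x, y, scale=mad):
    """Gnanadesikan-Kettenring identity: cov(x, y) = (s(x+y)^2 - s(x-y)^2) / 4."""
    s_plus = scale([a + b for a, b in zip(x, y)])
    s_minus = scale([a - b for a, b in zip(x, y)])
    return (s_plus ** 2 - s_minus ** 2) / 4.0


def pairwise_gk(columns, scale=mad):
    """Return (p x p matrix, number of 1d-scale evaluations) for p columns."""
    p = len(columns)
    calls = 0
    cov = [[0.0] * p for _ in range(p)]
    for j in range(p):
        cov[j][j] = scale(columns[j]) ** 2  # one scale per variable
        calls += 1
    for j, k in combinations(range(p), 2):
        cov[j][k] = cov[k][j] = gk_cov(columns[j], columns[k], scale)
        calls += 2  # two scales (sum and difference) per pair
    return cov, calls
```

In this sketch the number of scale evaluations is p + 2 * choose(p,2) = p^2,
i.e. 490000 for p = 700, each on a vector of length n.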

I now think that the "Maronna Method" (said to be based on
Maronna (1976), Annals) may even work faster {than OGK versions},
and even more probably the quadrant correlation,
because there one only needs p median+MAD computations and not
on the order of choose(p,2) of them.
Does anyone have experience or "hearsay evidence" or recommendations?
I think some versions of these need to go into robustbase.
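
For comparison, a rough sketch of a quadrant-correlation based covariance,
again in plain Python with hypothetical names (mad, quadrant_cov). It
assumes the usual sin(pi/2 * q) transform, which makes the sign correlation
consistent at the bivariate normal, and uses the MAD for the scales; whether
any particular package does exactly this is not something I have checked.

```python
import math
from statistics import median


def mad(x):
    """Median absolute deviation, scaled by 1.4826 (consistency at the normal)."""
    m = median(x)
    return 1.4826 * median([abs(v - m) for v in x])


def quadrant_cov(columns):
    """Pairwise quadrant-correlation covariance for a list of p columns."""
    p = len(columns)
    n = len(columns[0])
    signs, scales = [], []
    for col in columns:  # only p median + MAD computations in total
        m = median(col)
        signs.append([(v > m) - (v < m) for v in col])  # sign of v - median
        scales.append(mad(col))
    cov = [[0.0] * p for _ in range(p)]
    for j in range(p):
        for k in range(j, p):
            if j == k:
                r = 1.0
            else:
                # average product of signs, then map to a correlation
                q = sum(a * b for a, b in zip(signs[j], signs[k])) / n
                r = math.sin(math.pi / 2.0 * q)
            cov[j][k] = cov[k][j] = r * scales[j] * scales[k]
    return cov
```

The pairwise part is now only sums of sign products, with no further
median or MAD computation. Like any pairwise construction, the resulting
matrix is not guaranteed to be positive definite.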

BTW: It seems the S-PLUS library "robust" has now been ported to an
     R package "robust", mostly, though accompanied by a
     peculiar Insightful licence.
There's also the "pairwise quadrant correlation",
which I think is a cheap version of the OGK.

Martin



