[RsR] Singular covariance in plot.lmrob
Werner Stahel
Fri Dec 5 09:47:11 CET 2008
Dear all,
Thanks to a computational problem, we have stumbled into a
discussion that is more fundamental:
What should leverage be in a model with both categorical and
continuous explanatory variables?
A. If we consider the original definition connecting observed and
fitted values, the categorical variables must remain in the
design matrix on which leverage is calculated.
B. If leverage is meant as some kind of summary measure of
influence on coefficients, one may argue for separating the
influence on the coefficients of continuous Xs from the influence
on the estimated effects of factors.
For unbalanced designs, however, this influence depends on a
partialized design matrix, not on the "sub-design" obtained by
considering only the part of the design matrix corresponding to
the continuous variables.
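To make the difference between the "sub-design" and the
partialized design concrete, here is a small numerical sketch in
Python. It uses classical least-squares leverage only, and the
design (one two-level factor with unbalanced groups, one continuous
x) and the data are made up for illustration. For the full design,
h_ii splits into a factor part 1/n_g and a part coming from x with
its group means removed; the leverage computed from intercept and x
alone is a different quantity.

```python
from math import sqrt

def gram_schmidt(columns):
    """Orthonormalise the given columns by modified Gram-Schmidt."""
    basis = []
    for col in columns:
        v = list(col)
        for q in basis:
            dot = sum(a * b for a, b in zip(q, v))
            v = [a - dot * b for a, b in zip(v, q)]
        norm = sqrt(sum(a * a for a in v))
        if norm < 1e-12:
            raise ValueError("columns are linearly dependent")
        basis.append([a / norm for a in v])
    return basis

def leverages(columns):
    """Diagonal of the hat matrix of the design with these columns."""
    basis = gram_schmidt(columns)
    n = len(columns[0])
    return [sum(q[i] ** 2 for q in basis) for i in range(n)]

# Made-up unbalanced data: 2 observations in level 0, 6 in level 1.
group = [0, 0, 1, 1, 1, 1, 1, 1]
x = [1.0, 3.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
n = len(x)

dummy_0 = [1.0 if g == 0 else 0.0 for g in group]
dummy_1 = [1.0 if g == 1 else 0.0 for g in group]

# (A) Leverage on the full design: factor dummies plus x.
h_full = leverages([dummy_0, dummy_1, x])

# Factor part alone: equals 1/n_g for each observation.
h_factor = leverages([dummy_0, dummy_1])

# (B) Partialized x: residual of x after regressing on the factor,
# which for a single factor means subtracting the group mean.
group_mean = {
    g: sum(xi for xi, gi in zip(x, group) if gi == g) / group.count(g)
    for g in set(group)
}
x_partial = [xi - group_mean[gi] for xi, gi in zip(x, group)]
h_partial = leverages([x_partial])

# "Sub-design": intercept and x only, ignoring the factor.
h_sub = leverages([[1.0] * n, x])

for i in range(n):
    print(
        f"obs {i}: full={h_full[i]:.3f}  "
        f"factor+partial={h_factor[i] + h_partial[i]:.3f}  "
        f"sub-design={h_sub[i]:.3f}"
    )
```

The first two printed columns agree up to rounding error, while the
sub-design column does not, which is the point made under B for
unbalanced designs.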
My conclusion: The h_ii (or Mahalanobis distances) calculated
for the sub-design matrix have no clear interpretation and should
be avoided.
My recommendation:
a. If a high-breakdown "covariance estimate" for
the full design matrix fails because of too many singular
elemental subsets (a few can simply be ignored, as proposed
before), then an M-estimator should be used.
b. If the M-estimator converges to a singular matrix, then
the non-robust sample covariance should be used.
Of course, the output should be clear about the version used.
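The fallback order in a. and b. could be organised roughly as in
the following Python sketch. This is only an illustration of the
control flow, not the code in plot.lmrob: the exception class
TooManySingularSubsets, the function covariance_with_fallback and
the two stand-in estimators are hypothetical names, and the
singularity check (a failed Cholesky factorisation with a relative
tolerance) is one possible choice among several.

```python
from math import sqrt

class TooManySingularSubsets(ArithmeticError):
    """Hypothetical error: the high-breakdown estimator gave up."""

def sample_cov(rows):
    """Classical (non-robust) sample covariance matrix."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [
        [
            sum((r[j] - means[j]) * (r[k] - means[k]) for r in rows) / (n - 1)
            for k in range(p)
        ]
        for j in range(p)
    ]

def is_singular(cov, tol=1e-10):
    """True if a Cholesky factorisation of cov breaks down."""
    p = len(cov)
    chol = [[0.0] * p for _ in range(p)]
    for j in range(p):
        d = cov[j][j] - sum(chol[j][k] ** 2 for k in range(j))
        if d <= tol * abs(cov[j][j]):
            return True
        chol[j][j] = sqrt(d)
        for i in range(j + 1, p):
            s = sum(chol[i][k] * chol[j][k] for k in range(j))
            chol[i][j] = (cov[i][j] - s) / chol[j][j]
    return False

def covariance_with_fallback(rows, high_breakdown, m_estimator):
    """Return (cov, label); label records which version was used."""
    try:
        return high_breakdown(rows), "high-breakdown"
    except TooManySingularSubsets:
        pass  # step a: fall back to the M-estimator
    cov = m_estimator(rows)
    if not is_singular(cov):
        return cov, "M-estimator"
    # step b: M-estimator converged to a singular matrix
    return sample_cov(rows), "classical"

# Stand-in estimators, only to exercise the control flow.
def failing_high_breakdown(rows):
    raise TooManySingularSubsets("too many singular elemental subsets")

def degenerate_m_estimator(rows):
    return [[1.0, 1.0], [1.0, 1.0]]  # singular on purpose

rows = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]]
cov, label = covariance_with_fallback(
    rows, failing_high_breakdown, degenerate_m_estimator
)
print("version used:", label)
```

Returning the label together with the matrix is one simple way to
keep the output clear about the version used.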
With some effort, one might try to define a robust version of
the partialized design matrix (see B) and the corresponding
estimator of the "covariance matrix".
Let me add something that may be obvious to all of us:
The problem with the factors also appears for the high breakdown
regression estimator itself. The solution by Maronna and Yohai ?
is to split the problem into continuous and categorical
variables.
Therefore, there may be a treatment of the leverage problem that
corresponds to this estimation procedure.
I plan to think some more about this and communicate any results
from it -- unless somebody tells me that this has been done long
ago ...
Cheers
Werner Stahel
----------------- This message was sent by ---------------------------
Werner Stahel http://stat.ethz.ch/~stahel
Seminar fuer Statistik phone : +41 44 632 34 30
ETH-Zentrum, LEO D8 fax : +41 44 632 12 28
CH-8092 Zurich, Switzerland meet me: Leonhardstr.27, D8