[RsR] Singular covariance in plot.lmrob

Christian Hennig chrish sending from stats.ucl.ac.uk
Fri Nov 28 14:01:39 CET 2008


Dear list,

I have come across several situations in which the robust Mahalanobis distance 
vs. residuals plot, the first default plot in plot.lmrob, gave an error like 
this:

# recomputing robust Mahalanobis distances
# The covariance matrix has become singular during
# the iterations of the MCD algorithm.
# There are 14 observations (in the entire dataset of 392 obs.) lying on
# the hyperplane with equation a_1*(x_i1 - m_1) + ... + a_p*(x_ip - m_p)
# = 0 with (m_1,...,m_p) the mean of these observations and coefficients
# a_i from the vector a <- c(-0.0102123, 0, 0, 0, 0, -0.9999479)
# Error in solve.default(cov, ...) :
#   system is computationally singular: reciprocal condition number = 2.33304e-3

This particular error was produced with the Auto-mpg dataset from
http://archive.ics.uci.edu/ml/datasets.html

library(robustbase)  # provides lmrob() and its plot method

autod <- read.table("auto-mpg.data", na.strings = "?",
                    col.names = c("mpg", "cylinders", "displacement",
                                  "horsepower", "weight", "acceleration",
                                  "modelyear", "origin", "carname"))
autoc <- autod[complete.cases(autod), ]  # drop rows with missing values
auto17 <- autoc[, 1:7]                   # mpg and the six predictors
rautolm <- lmrob(mpg ~ cylinders + displacement + horsepower + weight +
                   acceleration + modelyear, data = auto17)
plot(rautolm)  # the first plot (robust distances vs. residuals) fails

(I don't claim that fitting a linear model is the most reasonable thing to do
with these data, given the nonlinearity, but anyway...)
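
To illustrate the mechanism, here is a minimal sketch in base R on synthetic
data (not the Auto-mpg data; the variable names and values are made up for
illustration). It shows that a subset in which a discrete variable is constant
has a singular covariance matrix even though the full sample does not, which
seems to be what the message above reports for the 14 observations:

```r
# Synthetic data: one continuous and one discrete predictor with many ties.
set.seed(1)
x <- cbind(weight    = rnorm(100, mean = 3000, sd = 500),
           modelyear = rep(70:79, each = 10))

# The covariance of the full sample is numerically non-singular.
full_ok <- rcond(cov(x)) > .Machine$double.eps

# A subset with constant modelyear lies on the hyperplane modelyear = 70,
# so its covariance matrix has a zero row and column.
sub <- x[x[, "modelyear"] == 70, , drop = FALSE]
sub_cov <- cov(sub)

# Inverting it fails, as solve.default() does inside the plot method.
inv <- tryCatch(solve(sub_cov), error = function(e) e)
print(full_ok)
print(rcond(sub_cov))
print(conditionMessage(inv))
```
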

This problem arises easily if at least one of the variables is discrete and
several observations share the same value. Such a situation is by no means
atypical, so I think it would be worthwhile to do something about it, for
example checking for singularity internally and, if it occurs, trying a
different initial sample.

It may also make sense to offer an option to tune the robust covariance matrix
down to, say, 25% breakdown, because one may still want to see something when
half of the data lie on a lower-dimensional hyperplane (as with a binary
x-variable) but regression still makes sense.
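
The "check singularity and redraw" idea could look roughly like the following
base R sketch. The helper draw_nonsingular_subset() and its arguments are
hypothetical and are not part of robustbase; this is not how covMcd() works
internally, only an illustration of redrawing an initial sample when its
covariance is numerically singular:

```r
# Hypothetical helper: draw an initial subset of size h and redraw while
# the covariance of the subset is numerically singular.
draw_nonsingular_subset <- function(x, h = ncol(x) + 1, max_tries = 50,
                                    tol = .Machine$double.eps) {
  x <- as.matrix(x)
  for (i in seq_len(max_tries)) {
    idx <- sample(nrow(x), h)
    # rcond() estimates the reciprocal condition number; values at or
    # below tol are treated as singular (same default tol as solve()).
    if (rcond(cov(x[idx, , drop = FALSE])) > tol) {
      return(list(index = idx, tries = i))
    }
  }
  stop("no non-singular subset found in ", max_tries, " tries")
}

# Synthetic example with a binary x-variable: subsets in which the binary
# variable is constant are rejected and redrawn.
set.seed(42)
x <- cbind(z = rnorm(40), binary = rep(0:1, times = 20))
res <- draw_nonsingular_subset(x)
print(res$index)
```

As for the 25% breakdown idea, covMcd() in robustbase already takes an alpha
argument (alpha = 0.75 uses roughly 75% of the observations), so the option
would mainly need to be passed through; whether plot.lmrob allows that is not
assumed here.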

Best regards,
Christian

*** --- ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
chrish using stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche



