[RsR] cov.mcd vs covMcd - why the difference?

S Ellison
Sat Aug 22 04:06:34 CEST 2015


cov.mcd in MASS and covMcd in robustbase both calculate the Minimum Covariance Determinant (MCD) multivariate location and scale estimator.

For a 2-variable problem, the two return identical correlation matrices, but the covariance matrices differ. In that case the difference appears to be a single scaling factor, equal to the product of the two correction factors returned as 'cnp2' in the 'mcd' object from robustbase. As a result, in the modified ?covMcd example below the robustbase covariances are appreciably larger. With more than two variables things aren't so simple.

Is there a rationale for the difference? And of the two implementations, which might be the more defensible estimate of covariance for modestly outlier-contaminated data?


S Ellison

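# Example (reduced to 2 variables)
library(robustbase)   # covMcd() and the hbk data set
library(MASS)         # cov.mcd()
data(hbk)
hbk.x <- data.matrix(hbk[, 1:2])
set.seed(17)

# robustbase: store the fit and print its covariance matrix
(cH <- covMcd(hbk.x, cor=TRUE))$cov

# MASS: store the fit and print its covariance matrix
(cH.M <- cov.mcd(hbk.x, cor=TRUE))$cov

# Element-wise ratio of the two covariance matrices
cH$cov/cH.M$cov

# Product of the two correction factors reported by robustbase
prod(cH$cnp2)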
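
For the case with more than two variables, the same comparison can be sketched roughly as follows. This is only an illustration, assuming all three X columns of hbk are used; the names hbk.x3, cH3, cH3.M and ratio3 are made up here and are not part of the ?covMcd example. It prints the quantities being compared and makes no claim about which estimate is preferable.

```r
library(robustbase)  # covMcd() and the hbk data set
library(MASS)        # cov.mcd()

data(hbk)
hbk.x3 <- data.matrix(hbk[, 1:3])   # assumption: use all three X columns

set.seed(17)
cH3   <- covMcd(hbk.x3, cor = TRUE)    # robustbase fit
cH3.M <- cov.mcd(hbk.x3, cor = TRUE)   # MASS fit

# Element-wise ratio of the two covariance estimates
ratio3 <- cH3$cov / cH3.M$cov
print(ratio3)

# Spread of the ratios; a value near zero would indicate a single scale factor
print(diff(range(ratio3)))

# Candidate scale factor: product of the two robustbase correction factors
print(prod(cH3$cnp2))

# Check whether the correlation matrices still agree
print(all.equal(cH3$cor, cH3.M$cor, check.attributes = FALSE))
```

If the spread of ratio3 is not close to zero, the two results differ by more than a common scale factor, for instance because the two searches settled on different subsets.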

