[RsR] MASS:rlm() and robustbase::lmrob() are both breaking
Ajay Shah
ajayshah at mayin.org
Sun Sep 7 16:13:50 CEST 2008
I have a regression where the lm() goes through fine. This mailing
list has always encouraged me to worry about how a robust regression
might do things differently. I tried two approaches, and neither
works.
First I need to give you the dataset:
> load(url("http://www.mayin.org/ajayshah/tmp/long.rda"))
This gives you a data frame named "long". Here's the simple lm():
> summary(lm(da.g1 ~ -1 + f.year +
+ major.industry +
+ i.x +
+ lta.l1 + I(lta.l1^2), data=long))
Call:
lm(formula = da.g1 ~ -1 + f.year + major.industry + i.x + lta.l1 +
    I(lta.l1^2), data = long)

Residuals:
     Min       1Q   Median       3Q      Max
-632.563  -15.405   -5.090    7.797  543.972

Coefficients:
                          Estimate Std. Error t value Pr(>|t|)
f.year2002                15.94994    4.52330   3.526 0.000424 ***
f.year2003                15.89005    4.50107   3.530 0.000418 ***
f.year2004                19.38506    4.48749   4.320 1.58e-05 ***
f.year2005                23.65796    4.49146   5.267 1.43e-07 ***
f.year2006                32.07334    4.48707   7.148 9.72e-13 ***
f.year2007                35.88498    4.51369   7.950 2.16e-15 ***
major.industryDiversified  1.74538    3.19979   0.545 0.585452
major.industryElectricity  3.61036    6.16091   0.586 0.557887
major.industryFood         1.52626    1.70112   0.897 0.369637
major.industryMachinery   -0.15078    1.40149  -0.108 0.914329
major.industryMetals       5.94554    1.66175   3.578 0.000349 ***
major.industryMiscManuf    1.76956    2.17527   0.813 0.415965
major.industryNonMetalMin  1.49889    1.92084   0.780 0.435224
major.industryServ.IT      8.62764    1.86841   4.618 3.95e-06 ***
major.industryServ.Other   6.43315    1.70598   3.771 0.000164 ***
major.industryTextiles     0.07868    1.56312   0.050 0.959859
major.industryTransportEq  4.81549    1.76354   2.731 0.006338 **
i.xTRUE                    4.15376    0.97944   4.241 2.26e-05 ***
lta.l1                    -3.91434    1.58494  -2.470 0.013546 *
I(lta.l1^2)                0.23105    0.13922   1.660 0.097045 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 34.48 on 6829 degrees of freedom
  (1649 observations deleted due to missingness)
Multiple R-squared: 0.2266,     Adjusted R-squared: 0.2244
F-statistic: 100.1 on 20 and 6829 DF,  p-value: < 2.2e-16
In this model, f.year and major.industry are factors, i.x is a
logical, and lta.l1 and the left hand side variable (da.g1) are
numeric. I put a -1 in the formula to drop the intercept, so that
every level of f.year gets its own dummy variable.
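To make the effect of the -1 concrete, here is a minimal sketch on
made-up data. The data frame "toy" and its values are hypothetical and
are not taken from "long"; only the variable names mirror the model
above. It compares the design-matrix columns with and without the
intercept:

```r
# Hypothetical toy data, used only to illustrate how the factors are coded.
toy <- data.frame(
  f.year         = factor(c("2002", "2003", "2004", "2002", "2003", "2004")),
  major.industry = factor(c("Food", "Metals", "Food", "Metals", "Food", "Metals"))
)

# With an intercept, the first level of every factor is absorbed into it.
with_int <- colnames(model.matrix(~ f.year + major.industry, data = toy))
print(with_int)

# Without an intercept, f.year keeps all of its levels, while
# major.industry still drops its reference level ("Food").
no_int <- colnames(model.matrix(~ -1 + f.year + major.industry, data = toy))
print(no_int)
```

This matches the pattern in the lm() output: all six f.year levels get
a coefficient, while major.industry is reported relative to a reference
level that does not appear in the table.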
On to my woes with robust regressions. MASS::rlm() breaks:
> library(MASS)
> summary(rlm(da.g1 ~ -1 + f.year +
+ major.industry +
+ i.x +
+ lta.l1 + I(lta.l1^2), method="MM", data=long))
Error in rlm.default(x, y, weights, method = method, wt.method = wt.method, :
'x' is singular: singular fits are not implemented in rlm
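For reference, one generic way to test whether a design matrix is
singular is to compare its rank with its number of columns. The sketch
below does this on made-up data with a deliberately redundant column
(the names "toy" and "X" are hypothetical). It illustrates the kind of
check only; it is not a diagnosis of what happens with "long", and I am
not claiming this is exactly the test rlm() performs internally.

```r
# Hypothetical toy data with a deliberately redundant column (x2 = 2 * x1).
set.seed(1)
toy <- data.frame(x1 = rnorm(10))
toy$x2 <- 2 * toy$x1
toy$y  <- rnorm(10)

# Design matrix for the formula: columns (Intercept), x1, x2.
X <- model.matrix(y ~ x1 + x2, data = toy)

# A rank below the number of columns means the columns are linearly dependent.
rank_X <- qr(X)$rank
print(c(rank = rank_X, columns = ncol(X)))
```

The same comparison could be run on model.matrix() applied to the
formula above with data=long, keeping in mind that rows with missing
values are dropped before fitting.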
robustbase::lmrob() breaks:
> library(robustbase)
> summary(lmrob(da.g1 ~ -1 + f.year +
+ major.industry +
+ i.x +
+ lta.l1 + I(lta.l1^2), data=long))
Too many singular resamples
Aborting fast_s_w_mem()
Error in lmrob.S(x = x, y = y, control = control) :
C function R_lmrob_S() exited prematurely
If you could guide me on what I'm doing wrong, that would be great.
How would I do the above specification as a robust regression? I
googled around and found a few others asking these same questions in
the past, but it didn't look like there was a clear answer.
--
Ajay Shah http://www.mayin.org/ajayshah
ajayshah using mayin.org http://ajayshahblog.blogspot.com
<*(:-? - wizard who doesn't know the answer.