[RsR] extracting weights from nlrob
Andreas Ruckstuhl
rk@t @end|ng |rom zh@w@ch
Tue Nov 20 10:40:49 CET 2012
Dear Andy
I would recommend to use lmrob for running polynomial regression. It can
handle outliers on leverage points much better than nlrob.
library(robustbase)
set.seed(54362)
DAT <- data.frame(x=c(seq(-5,5,by=1), 8))
DAT$xq <- DAT$x^2
DAT$y <- 2 + 3*DAT$x + 1.2*DAT$xq + rnorm(nrow(DAT), sd=1.5)
DAT$y[nrow(DAT)] <- 60
DAT.lm <- lm(y ~ x + xq, data=DAT)
summary(DAT.lm)
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.9686 2.4849 2.402 0.039770 *
## x 2.1114 0.5750 3.672 0.005141 **
## xq 0.7117 0.1257 5.663 0.000308 ***
## ---
## Residual standard error: 6.41 on 9 degrees of freedom
DAT.rlm <- lmrob(y ~ x + xq, data=DAT)
summary(DAT.rlm)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.48591 0.47660 3.118 0.0124 *
## x 2.72508 0.08822 30.890 1.91e-10 ***
## xq 1.24015 0.02458 50.457 2.37e-12 ***
## ---
## Robust residual standard error: 1.451
str(DAT.rlm)
(h.rw <- DAT.rlm$weights)
h.sel <- h.rw < 0.1
plot(y~x, data=DAT)
lines(DAT$x, predict(DAT.rlm), col="blue", lwd=2)
lines(DAT$x, predict(DAT.lm), col="red")
points(DAT$x[h.sel], DAT$y[h.sel], pch=16, col="orange")
## I don't know the paper by Motulsky and Brown 2006. I do not recommend
## removing outliers but just recommend identifying outliers and if
## possible correct typos and such things.
DAT.nlrob1 <- nlrob(y ~ a + b*x + c*x^2, start=list(a=6,b=2,c=0.7),
data=DAT)
summary(DAT.nlrob1)
## Formula: y ~ a + b * x + c * x^2
## Parameters:
## Estimate Std. Error t value Pr(>|t|)
## a 5.1419 2.0241 2.540 0.03169 *
## b 1.8283 0.4775 3.829 0.00403 **
## c 0.7177 0.1004 7.148 5.38e-05 ***
## ---
## Robust residual standard error: 4.521
## nlrob cannot cope wit outliers on leverage points!
## Hence
h.d <- (DAT$x-median( DAT$x))/mad(DAT$x)
DAT$wx <- 1/sqrt(1+8*pmax(0, h.d^2-1)/sqrt(2))
DAT.nlrob2 <- nlrob(y ~ a + b*x + c*x^2, start=list(a=6,b=2,c=0.7)
,data=DAT,
weights=DAT$wx)
summary(DAT.nlrob2)
## Parameters:
## Estimate Std. Error t value Pr(>|t|)
## a 1.81836 0.74151 2.452 0.0366 *
## b 2.69350 0.16763 16.068 6.20e-08 ***
## c 1.19773 0.05919 20.236 8.19e-09 ***
## ---
## Robust residual standard error: 1.417
str(DAT.nlrob2)
DAT.nlrob2$w.r
## [1] 1.00000000 1.00000000 1.00000000 1.00000000 1.00000000 1.00000000
## [7] 1.00000000 0.88186039 1.00000000 1.00000000 1.00000000 0.04762281
## These are more reliable results.
## But be careful. This approach with weighting does not yield a
## high-breakdown estimator.
And by the way:
I guess you will be very careful with expressions like 'DAT$x^2' because
of numerical issues, won't you.
All the best,
Andreas
Am 20.11.2012 09:05, schrieb Andrew Halford:
> Hi Listers,
>
> I have been running a robust 2nd order polynomial regression and I want to
> extract the weights information to see what datapoints are behaving as
> outliers. Although I can see the summary data for the weights as part of
> the output, I have not been able to find any help for extracting all the
> individual weights. I would also like to be able to colour these points on
> a scatterplot if possible.
>
> I also cannot find any R packages that provide a way to calculate and
> remove outliers in nonlinear regressions - along the lines of the paper by
> Motulsky and Brown 2006.
>
> Any advice on this would be most welcome.
>
> Thanks for help.
>
> Andy
>
--
----------------------------------------------------------------------
Prof. Dr. Andreas Ruckstuhl
ZHAW Zürcher Hochschule für Angewandte Wissenschaften
IDP Institut für Datenanalyse und Prozessdesign
Rosenstrasse 3 Tel. : +41 (0)58 934 78 12
Postfach Fax : +41 (0)58 935 78 12
CH-8401 Winterthur e-Mail: Andreas.Ruckstuhl using zhaw.ch
WWW : http://www.idp.zhaw.ch
More information about the R-SIG-Robust
mailing list