[RsR] choosing and plotting a model

Andreas Ruckstuhl rk@t @end|ng |rom zh@w@ch
Tue Nov 30 04:19:54 CET 2010


Dear Pep

To choose between different models you can use anova() (i.e., 
anova.lmrob()) which runs robust tests; cf. example below. Since 
anova.lmrob() can not handle poly(x, ...) directly, you must run some 
manual preparations on the data set (cf. below).

Fitting higher order polynomial can be very tricky. If there are 
outliers at one or both end of the range, it is hard to tell whether one 
should fit a higher order polynomial or better a low order polynomial 
with some outliers at the end of the range. - The goals of your analysis 
may give a hint which solution is more preferred. The data themselves 
are often not decisive.

If there is no physical law supporting a high order polynomial I prefer 
to fit local regression smoother like lowess() or loess.

To plot the models you must install the current version of the package 
'robustbase' (cf. below) where predict.lmrob() is implemented.


All the best, Andreas

--------------------- Example ----------------------------------

ADat <- data.frame(x=1:19)
set.seed(3011)
ADat$y <- 4 + 0.02*(ADat$x - 9)^3 + rnorm(nrow(ADat),0,0.7)
ADat$y[c(5,12)] <- c(15,-5)


## Data Preparation:
X <- model.matrix(~   poly (x,6), data=ADat)
ADat2 <- data.frame(y=ADat$y, X[,-1])
names(ADat2)[-1] <- paste("P", 1:6, sep="")

## Robust Fitting
library(robustbase)
pl1 <- lmrob(y ~  P1, data=ADat2)
pl2 <- lmrob(y ~  P1 + P2, data=ADat2)
pl3 <- lmrob(y ~  P1 + P2 + P3, data=ADat2)
pl4 <- lmrob(y ~  . - P5 - P6, data=ADat2)
pl5 <- lmrob(y ~  . - P6, data=ADat2)
pl6 <- lmrob(y ~  ., data=ADat2)

summary(pl6)

## Model Selection
anova(pl6, pl3, test="Wald")
## or alternatively
anova(pl6, pl3, test="Deviance")## Pr(>chisq) = 0.3652

anova(pl6, pl2, test="Deviance")## Pr(>chisq) = < 2.2e-16


## Displaying Model Fits
## For predict: Please load 'robustbase' from R-forge
## see https://r-forge.r-project.org/R/?group_id=59
plot(y ~x, data=ADat)
lines(ADat$x, predict(pl1), col=2)
lines(ADat$x, predict(pl2), col=3)
lines(ADat$x, predict(pl3), col=4)
lines(ADat$x, predict(pl4), col=5)

----------------------------------------------------------------------------------


------- Ursprüngliche Nachricht --------
Von: Pep Serra <josep.serra using uab.cat>
Gesendet: Mon, 29 Nov 2010 15:52:08 +0100
An: r-sig-robust using r-project.org
Betreff: [RsR] choosing and plotting a model

> Dear all,
> 
> Trying to choose between different curve fittings within different 
> robust models (what we would get with cftools in MATLAB, but I am 
> definetly R-addicted).
> 
> Maybe it is a trivial questions:
> 
> lm1<-lmrob(y ~  x, data=a)
> pl2<-lmrob(y ~   poly (x,2), data=a)
> pl3<-lmrob(y ~   poly (x,3), data=a)
> pl4<-lmrob(y ~   poly (x,4), data=a)
> pl5<-lmrob(y ~   poly (x,5), data=a)
> pl6<-lmrob(y ~   poly (x,6), data=a)
> 
> I want to choose between different models according to the best fit, but 
> the problem is that there is no "best fit" in robust (or R2). How could 
> I do that?
> 
> In addition, I would love to plot these models (scatterplot and than 
> later add the lines to see how the models look like)
> 
> I tried, for instance:
> 
> lines (x,predict (pl2),col="red")
> 
> but I get many curves (I think) or a filled polygon. Is there any way to 
> get a simple line out of the model?
> 
> _______________________________________________
> R-SIG-Robust using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-robust

-- 
----------------------------------------------------------------------

Prof. Dr. Andreas Ruckstuhl
ZHAW Zürcher Hochschule für Angewandte Wissenschaften
IDP Institut für Datenanalyse und Prozessdesign
Rosenstrasse 3                    Tel.  : +41 (0)58 934 78 12
Postfach                          Fax   : +41 (0)58 935 78 12
CH-8401 Winterthur                e-Mail: Andreas.Ruckstuhl using zhaw.ch
                                   WWW   : http://www.idp.zhaw.ch




More information about the R-SIG-Robust mailing list