[R] Lack of Fit test
Prof Brian Ripley
ripley at stats.ox.ac.uk
Wed Feb 23 16:05:44 CET 2000
> From: "Alan T. Arnholt" <arnholt at math.appstate.edu>
> To: Bill Venables <William.Venables at cmis.CSIRO.AU>
> Cc: r-help at stat.math.ethz.ch, arnholt at math.appstate.edu
> Subject: Re: [R] Lack of Fit test
> Date: Wed, 23 Feb 2000 09:40:21 -0500 (EST)
> X-Authentication: none
>
>
> I guess my question was not adequately stated when I sent it to the list. I
> was inquiring to see if anyone had written code to perform a lack of fit
> test in the special case when you have replicate predictors. If your
> predictors contain replicates (repeated x values with one predictor or
> repeated combinations of x values with multiple predictors), you can easily
> calculate a pure error test for lack of fit. The error term is partitioned
> into pure error (error within replicates) and lack-of-fit error, and the
> F-test can be used to test whether you have chosen an adequate regression
> model. See Neter, Kutner, Nachtsheim, and Wasserman, fourth edition,
> page 115, or Draper and Smith. Bill Venables wrote "...It makes it
> impossible to write code to do it automatically, but if you know
> what you are doing, the procedure is simple with the software you
> have. As with so many things in statistics, it is not a matter
> of good software so much as of having a good understanding of the
> problem in hand." I guess I am not sure what "if you know what you are
> doing the procedure is simple..." means, since I clearly know what I am
> doing in reference to the statistical procedure. Where I need help is not
> with the statistics, but rather with automating the procedure in R.
That's easy. Suppose your data frame df has some column, say, ID, which
identifies the groups of replicates (cases sharing the same predictor
values), and you fitted
fit1 <- lm(y ~ rhs, data=df)
Now do
fit2 <- lm(y ~ factor(ID), data=df)
anova(fit1, fit2, test="F")
e.g.
set.seed(123)
df <- data.frame(x = rnorm(10), ID=1:10)[rep(1:10, 1+rpois(10, 3)), ]
df$y <- 3*df$x+rnorm(nrow(df))
fit1 <- lm(y ~ x, data=df)
fit2 <- lm(y ~ factor(ID), data=df)
anova(fit1, fit2, test="F")
  Res.Df Res.Sum Sq Df Sum Sq F value Pr(>F)
1     23     26.101
2     15     15.222  8 10.878  1.3399 0.2975
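
To see what that comparison computes, the partition can also be worked out
by hand. The following is only a sketch under the same assumptions as the
example above (a data frame df with columns x, y and ID); the names ss_pe,
ss_lof, f_stat and p_val are made up for illustration. The residual sum of
squares of fit2 is the pure error, and the difference between the two
residual sums of squares is the lack-of-fit term.

```r
# Rebuild data as in the example above (the exact values depend on the
# random number generator of the R version in use)
set.seed(123)
df <- data.frame(x = rnorm(10), ID = 1:10)[rep(1:10, 1 + rpois(10, 3)), ]
df$y <- 3 * df$x + rnorm(nrow(df))

fit1 <- lm(y ~ x, data = df)           # model under test
fit2 <- lm(y ~ factor(ID), data = df)  # one mean per replicate group

# Pure error: variation of y about its own group mean
ss_pe <- sum((df$y - ave(df$y, df$ID))^2)
df_pe <- nrow(df) - length(unique(df$ID))

# Lack of fit: what remains of the residual sum of squares of fit1
ss_lof <- deviance(fit1) - ss_pe
df_lof <- df.residual(fit1) - df_pe

# F statistic and p-value for the lack-of-fit test
f_stat <- (ss_lof / df_lof) / (ss_pe / df_pe)
p_val  <- pf(f_stat, df_lof, df_pe, lower.tail = FALSE)

# Same comparison via anova(), for checking against the hand calculation
tab <- anova(fit1, fit2, test = "F")
```

The second row of tab should carry the same F statistic and p-value as
f_stat and p_val, since anova() performs the same partition.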
Despite Bill's sound comments, there is an R package lmtest on CRAN,
which is full of tests for linear models.
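
If there is no ready-made ID column and the replicates are repeated
combinations of several predictors, one possible way to build the grouping
factor is interaction(). This is a sketch on made-up data: the data frame
dat and its columns x1, x2 and y are hypothetical, and it assumes that
replicated predictor values are exactly equal (values that differ only by
rounding error would end up in separate groups).

```r
# Hypothetical data: two predictors, each combination observed three times
set.seed(1)
dat <- expand.grid(x1 = c(1, 2, 3), x2 = c(10, 20), rep = 1:3)
dat$y <- 1 + 2 * dat$x1 + 0.5 * dat$x2 + rnorm(nrow(dat))

# One factor level per distinct (x1, x2) combination;
# drop = TRUE discards combinations that never occur
dat$ID <- interaction(dat$x1, dat$x2, drop = TRUE)

fit1 <- lm(y ~ x1 + x2, data = dat)  # model under test
fit2 <- lm(y ~ ID, data = dat)       # one mean per predictor combination
anova(fit1, fit2, test = "F")
```

With 18 observations in 6 groups, the comparison has 3 degrees of freedom
for lack of fit and 12 for pure error.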
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595