[R] Problems in "plot.lm" with option "which=5"

Prof Brian Ripley ripley at stats.ox.ac.uk
Sat Nov 18 11:37:21 CET 2006


[Please don't send HTML mail: your message was badly formatted on arrival.
See the posting guide and the double-spaced mess below.]

This _is_ a bug: the author had tried to reorder the factor levels by mean 
fitted value, but forgot to reorder the residuals (etc) to match.  You may 
have noticed that the added line was not monotone, and it should have 
been.

I've fixed this in R-patched (and documented on the help page what was 
supposed to happen).
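
For illustration only, here is a minimal sketch of the intended behaviour using the example quoted below. It is not the code in plot.lm nor the change made in R-patched; the names lev_means, ord_levels and fac_ord are made up for this sketch, and rstandard() is used as a shortcut for the hand-computed standardized residuals. The point is that the factor itself is re-levelled, so each residual stays attached to its own level instead of only the axis labels being reordered.

```r
# Sketch only: not the plot.lm source and not the R-patched fix.
set.seed(3)
datos <- data.frame(fac.A = rep(c("bla", "Ur2", "pel", "arb"), each = 3),
                    y = c(rnorm(3, sd = 0.5), rnorm(9, sd = 2)))
datos$fac.A <- factor(datos$fac.A)  # explicit, in case strings are not auto-converted
model1 <- lm(y ~ fac.A, data = datos)

rs  <- rstandard(model1)            # standardized residuals
fit <- fitted(model1)

# Order the factor levels by their mean fitted value ...
lev_means  <- tapply(fit, datos$fac.A, mean)
ord_levels <- names(sort(lev_means))

# ... and re-level the factor itself, so observations and labels move together.
fac_ord <- factor(datos$fac.A, levels = ord_levels)

# Residuals against the re-levelled factor: positions and labels now agree.
plot(as.integer(fac_ord), rs, xaxt = "n",
     xlab = "fac.A (levels ordered by mean fitted value)",
     ylab = "Standardized residuals")
axis(1, at = seq_along(ord_levels), labels = ord_levels)
```

Reordering only the labels, while leaving the x positions in the original level order, gives the kind of mismatch reported below.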

On Fri, 17 Nov 2006, Gabriela Cendoya wrote:

> Hi:
>
> I think I have found an error in plot.lm with the option which=5. Of course I may be wrong, as usually happens, but I have worked on it for a while and shown it to some other people who work with R, and so far I don't see what I could be misinterpreting. I also went through the plot.lm code and changed some lines to get what I call "the right plot". If anybody is interested, I can send the modified code to show the problem I think I found and a possible solution.
>
>
>
> I'm working with R 2.4.0 on Windows XP, and here is a reproducible example (it is only meant to show the problem in the plot; the analysis itself doesn't make sense).
>
>
>
> set.seed(3)
>
> datos <-data.frame(fac.A=rep(c("bla","Ur2","pel","arb"),each=3),
>
>                       y= c(rnorm(3,sd=0.5),rnorm(9,sd=2)))
>
> model1 <- lm(y~fac.A,data=datos)
>
> plot(model1,which=5)   # plot1
>
>
>
> # plot1 shows level "arb" as having less dispersion than the other levels,
>
> # but if I draw the plot myself, look:
>
>
>
> hii <- lm.influence(model1, do.coef = FALSE)$hat
>
> s1 <- sqrt(deviance(model1)/df.residual(model1))
>
> rs <- residuals(model1)/(s1 * sqrt(1 - hii))
>
>
>
> plot(rs~datos$fac.A)   # plot2
>
>
>
> # plot2 shows that level "bla" is the least variable.
>
> # "pel" and "Ur2" also have problems, but this gives you the idea of what I think is wrong.
>
>
>
> What I found in the code is that, for this option (which=5), the x-axis labels are ordered so that the predicted values for the levels are increasing, but when the plot is actually drawn, the points do not follow that order.
>
>
>
> Thanks for your time (and sorry for my English).
>
>                                     Gabriela
>
> 	[[alternative HTML version deleted]]
>
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

