[R] Bug in stepAIC?
Martin C. Martin
martin at martincmartin.com
Thu Oct 12 15:47:10 CEST 2006
Prof Brian Ripley wrote:
> You sent this earlier to R-devel. Please do see the posting guide!
> Since you (incorrectly) thought this was a bug in MASS, you should have
> contacted the maintainer.
Thanks, but I did try emailing both you and Prof. Venables directly a
month ago. After not receiving a response, I emailed R-devel last week.
After not receiving a response there, I thought perhaps the code was
correct after all, and I misunderstood how to call it - a perfect
question for R-help.
There can be a fine line between R-help and R-devel, which is even
harder to find when you're new to R and you don't really know where the
problem is.
> On Wed, 11 Oct 2006, Martin C. Martin wrote:
>
>> Hi,
>>
>> First of all, thanks for the great work on R in general, and MASS in
>> particular. It's been a life saver for me many times.
>>
>> However, I think I've discovered a bug. It seems that, when I use
>> weights during an initial least-squares regression fit, and later try to
>> add terms using stepAIC(), it uses the weights when looking to remove
>> terms, but not when looking to add them:
>>
>> hills.lm <- lm(time ~ dist + climb, data = hills, weights = 1/dist2)
>
> Presumably dist^2?
Yes, sorry, a problem with Thunderbird being a little too smart for its
own good. :)
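For reference, here is the reproduction with the weights written out as
intended. This is a minimal sketch, assuming the MASS package is attached
so that hills and stepAIC() are available; the library(MASS) call is an
addition that was not in the original message.

```r
# Assumption: MASS is installed; it supplies the hills data and stepAIC().
library(MASS)

# Weighted least-squares fit, with the weights as intended (1/dist^2).
hills.lm <- lm(time ~ dist + climb, data = hills, weights = 1/dist^2)

# With no scope given, stepAIC() only considers removing terms.
small.hills.lm <- stepAIC(hills.lm)

# With an upper scope given, stepAIC() also considers adding climb back.
stepAIC(small.hills.lm, time ~ dist + climb)
```

The AIC values to compare are the <none> row of the first trace and the
"+ climb" row of the second.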
>> small.hills.lm <- stepAIC(hills.lm)
>> stepAIC(small.hills.lm, time ~ dist + climb)
>>
>> In the first stepAIC(), it says that the AIC for the full "time ~ dist +
>> climb" is 94.41. Yet, during the second stepAIC, it says adding climb
>> would produce an AIC of 212.1 (and an RSS of 12633.3). Is this a bug?
>
> Yes, but not in stepAIC. Consider
>
>> drop1(hills.lm)
> Single term deletions
>
> Model:
> time ~ dist + climb
>        Df Sum of Sq    RSS    AIC
> <none>              437.64  94.41
> dist    1    164.05 601.68 103.55
> climb   1      8.66 446.29  93.10
>> add1(small.hills.lm, time ~ dist + climb)
> Single term additions
>
> Model:
> time ~ dist
>        Df Sum of Sq     RSS   AIC
> <none>              15787.2 217.9
> climb   1    3153.8 12633.3 212.1
>> stats:::add1.default(small.hills.lm, time ~ dist + climb)
> Single term additions
>
> Model:
> time ~ dist
>        Df    AIC
> <none>    93.097
> climb   1 94.411
>
> so the bug is in add1.lm, part of R itself. Other code has been altered
> which then broke add1.lm and 'z' needs to be given class "lm". Now
> fixed in r-devel and r-patched.
Great; thanks!
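For anyone who wants to confirm the fix locally, one way is to compare the
AIC that add1() reports for adding climb with the AIC of the full weighted
fit, since the two should agree once add1.lm honours the weights. A minimal
sketch, assuming MASS is attached and that the add1() result can be indexed
by the row name "climb" and the column name "AIC", as in the printed output
above; the names full.aic and add.aic are illustrative only.

```r
# Assumption: MASS is installed; it supplies the hills data.
library(MASS)

# Full weighted model, and the same model with climb removed.
hills.lm <- lm(time ~ dist + climb, data = hills, weights = 1/dist^2)
small.hills.lm <- update(hills.lm, . ~ . - climb)

# AIC of the full weighted fit (second element of extractAIC's result).
full.aic <- extractAIC(hills.lm)[2]

# AIC that add1() reports for putting climb back into the smaller model.
add.aic <- add1(small.hills.lm, time ~ dist + climb)["climb", "AIC"]

# Should be TRUE when add1() uses the weights.
all.equal(full.aic, add.aic)
```

On an affected version the two numbers differ (212.1 versus 94.41 in the
output quoted above).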
Best,
Martin