[Rd] problem with zero-weighted observations in predict.lm?

Thu Jul 29 16:19:10 CEST 2010

I believe I now have this nailed down (a couple of further issues reared their heads along the way). Committed to r-devel.
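
For anyone who wants to check the behaviour discussed in the quoted messages below, here is a minimal sketch. The simulated data, the seed, and the names w and sigma_df are illustrative assumptions, not taken from the commit. It contrasts length(residuals), which counts the zero-weighted case, with df.residual, which on a current R release should not.

```r
## Illustrative data only: 10 observations, the first with zero weight.
set.seed(1)
y <- rnorm(10)
w <- c(0, rep(1, 9))

## Zero-rank (no coefficients) weighted fit, as in the quoted example.
fit <- lm(y ~ 0, weights = w)

## length(residuals) includes the zero-weighted observation ...
length(residuals(fit))

## ... whereas df.residual counts only the observations with nonzero weight.
df.residual(fit)

## Residual standard error computed from df.residual rather than from
## length(residuals); the zero-weighted case contributes nothing to the RSS.
sigma_df <- sqrt(sum(w * residuals(fit)^2) / df.residual(fit))
sigma_df
```

If summary(fit) uses df.residual throughout, its reported residual standard error should agree with sigma_df.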

-pd

On Jul 29, 2010, at 10:11 AM, Peter Dalgaard wrote:

> Peter Dalgaard wrote:
>> Prof Brian Ripley wrote:
>> 
>>> I think you will find that 'n' is used in several ways in predict.lm, 
>>> and since NA-handling was introduced in R 1.8.0 they may differ in 
>>> value.  So the safest route seems to be to change just 'n' in
>>> 
>>> 		df <- n - p
>> 
>> Yes, that seems to fix things. Will commit to R-devel shortly.
>> 
>> -p
>> 
> 
> Spoke too soon, it fixes Bill's case, but breaks one of the regression
> tests!
> 
> In fact this goes deeper, summary.lm special-cases the same zero-rank
> case by using length(residuals), so it also miscalculates with zero weights:
> 
>> fit <- lm(y~0,weights=c(0,rep(1,9)))
>> summary(fit)
> 
> Call:
> lm(formula = y ~ 0, weights = c(0, rep(1, 9)))
> 
> Residuals:
>     Min       1Q   Median       3Q      Max
> -1.95428 -1.40571 -0.42378 -0.05795  1.05518
> 
> No Coefficients
> 
> Residual standard error: 1.119 on 10 degrees of freedom
> 
> ----
> 
> Hum. lm() actually returns df.residual, AFAICS in all cases, now why
> don't we just use that throughout????
> 
> 
> -- 
> Peter Dalgaard
> Center for Statistics, Copenhagen Business School
> Phone: (+45)38153501
> Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com