[Rd] problem with zero-weighted observations in predict.lm?

Thu Jul 29 10:11:53 CEST 2010

Peter Dalgaard wrote:
> Prof Brian Ripley wrote:
> 
>> I think you will find that 'n' is used in several ways in predict.lm, 
>> and since NA-handling was introduced in R 1.8.0 they may differ in 
>> value.  So the safest route seems to be to change just 'n' in
>>
>>  		df <- n - p
> 
> Yes, that seems to fix things. Will commit to R-devel shortly.
> 
> -p
> 

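Spoke too soon: it fixes Bill's case, but breaks one of the regression
tests!

In fact this goes deeper: summary.lm special-cases the same zero-rank
case by using length(residuals), so it also miscalculates with zero
weights (10 residual degrees of freedom reported below, where 9 would
be correct, since one of the ten observations has weight zero):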

> fit <- lm(y~0,weights=c(0,rep(1,9)))
> summary(fit)

Call:
lm(formula = y ~ 0, weights = c(0, rep(1, 9)))

Residuals:
     Min       1Q   Median       3Q      Max
-1.95428 -1.40571 -0.42378 -0.05795  1.05518

No Coefficients

Residual standard error: 1.119 on 10 degrees of freedom

----

Hmm. lm() actually returns df.residual, AFAICS in all cases, so why
don't we just use that throughout?
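
A minimal sketch of the mismatch, assuming made-up data (x, y and w
below are illustrative, not Bill's original case) and a non-empty model
so that the rank is nonzero. It compares the count obtained from
length(residuals) with the df.residual component that lm() stores:

```r
set.seed(1)                        # made-up data, for illustration only
x <- 1:10
y <- rnorm(10)
w <- c(0, rep(1, 9))               # first observation gets zero weight

fit <- lm(y ~ x, weights = w)

n_resid <- length(residuals(fit))  # zero-weight case is still included
n_used  <- sum(w != 0)             # cases that actually enter the fit
rdf     <- df.residual(fit)        # value stored by lm()

c(n_resid = n_resid, n_used = n_used, rank = fit$rank, df.residual = rdf)
```

Here df.residual agrees with n_used - rank, whereas anything derived
from length(residuals) counts the zero-weight observation as well.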

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com