[R] R.squared in summary.lm with weights

peter dalgaard pdalgd at gmail.com
Fri Apr 8 13:28:01 CEST 2016


On 08 Apr 2016, at 12:57 , Duncan Murdoch <murdoch.duncan at gmail.com> wrote:

> On 07/04/2016 5:21 PM, Murray Efford wrote:
>> Following some old advice on this list, I have been reading the code for summary.lm to understand the computation of R-squared from a weighted regression. Usually weights in lm are applied to squared residuals, but I see that the weighted mean of the observations is calculated as if the weights are on the original scale:
>> 
>> [...]
>>     f <- z$fitted.values
>>     w <- z$weights
>> [...]
>>             m <- sum(w * f/sum(w))
>>             [mss <-]  sum(w * (f - m)^2)
>> [...]
>> 
>> This seems inconsistent to me. What am I missing?
> 
> I think you are expecting consistency where there needn't be any.  Why do you see an inconsistency here?  Those are different calculations. You get expressions like these if you assume observations have variance sigma^2/w, and you're trying to estimate sigma^2.
> 


It's also perfectly consistent that m is the minimizer of mss:

d/dm sum(w*(f-m)^2) = -2 sum(w*(f-m)) = 0 => m = sum(w*f) / sum(w)
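
As a quick numerical check, here is a minimal sketch on simulated data. The data and all variable names other than f, w, m and mss are made up for illustration; the rss line assumes that summary.lm computes the residual sum of squares as sum(w * r^2) from the unweighted residuals. It confirms that m is the weighted mean of the fitted values, that the derivative above vanishes there, and that mss/(mss + rss) reproduces the R-squared reported by summary.lm:

```r
# Simulated data for illustration only (not from the original post)
set.seed(1)
n <- 50
x <- runif(n)
w <- runif(n, 0.5, 2)                        # weights; Var(y_i) assumed sigma^2 / w_i
y <- 1 + 2 * x + rnorm(n, sd = 1 / sqrt(w))

fit <- lm(y ~ x, weights = w)
f <- fitted(fit)                             # z$fitted.values in summary.lm
r <- residuals(fit)                          # unweighted residuals, z$residuals

m   <- sum(w * f / sum(w))                   # weighted mean of the fitted values
mss <- sum(w * (f - m)^2)                    # weighted model sum of squares
rss <- sum(w * r^2)                          # weighted residual sum of squares
r2  <- mss / (mss + rss)

# First-order condition: d/dm sum(w * (f - m)^2) = -2 * sum(w * (f - m))
grad <- -2 * sum(w * (f - m))

print(all.equal(m, weighted.mean(f, w)))     # m is the weighted mean of f
print(abs(grad) < 1e-8)                      # derivative vanishes at m
print(all.equal(m, weighted.mean(y, w)))     # with an intercept, also the weighted mean of y
print(all.equal(r2, summary(fit)$r.squared)) # matches summary.lm
```

The third check holds because the model has an intercept, so the weighted residuals sum to zero; without an intercept, summary.lm uses sum(w * f^2) for mss instead.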

However, beware the distinction between inverse variance weights, replication weights, and sampling weights.
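
To illustrate one aspect of that distinction, here is a small sketch with made-up data (the data frame d, the count column k and the object names are all hypothetical). It fits the same model once with integer counts passed as weights and once on the data with each row repeated k times. Coefficients and R-squared agree, but lm counts rows rather than summing the weights, so the residual degrees of freedom, and with them sigma and the standard errors, differ:

```r
# Made-up data: k is how many times each (x, y) pair is taken to be replicated
d <- data.frame(x = 1:6,
                y = c(1.1, 2.3, 2.8, 4.2, 4.9, 6.3),
                k = c(1, 3, 2, 1, 4, 2))

fit_w   <- lm(y ~ x, data = d, weights = k)      # counts passed as weights
d_rep   <- d[rep(seq_len(nrow(d)), d$k), ]       # rows physically replicated
fit_rep <- lm(y ~ x, data = d_rep)

print(all.equal(coef(fit_w), coef(fit_rep)))     # same point estimates
print(all.equal(summary(fit_w)$r.squared,
                summary(fit_rep)$r.squared))     # same R-squared

print(df.residual(fit_w))                        # 6 rows  - 2 coefficients = 4
print(df.residual(fit_rep))                      # 13 rows - 2 coefficients = 11
print(summary(fit_w)$sigma > summary(fit_rep)$sigma)  # same rss, fewer df
```

So R-squared does not care which interpretation you have in mind, but anything that depends on the residual degrees of freedom does. Sampling weights are a different matter again and are not covered by this sketch.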


> Duncan Murdoch

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


