[R] Interesting behavior of lm() with small, problematic data sets

JRG loesljrg at accucom.net
Wed Sep 6 15:22:18 CEST 2017


Indeed (version-specific).

With R 3.4.1 on Linux, I get coefficients and residuals that are
numerically exact, with F-statistic = NaN, p-value = NA, R-squared = NaN, etc.

All of which is what ought to happen, given that the response variable
(y) is not actually variable.
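
A minimal sketch of the situation, using made-up data rather than the
original poster's (the x values and the constant 1 are assumptions chosen
only for illustration). With a constant response the total sum of squares
about the mean is exactly zero, so R-squared and the F-statistic are 0/0 in
exact arithmetic; what summary() actually prints for them can differ by R
version and platform, so no expected output is shown for those.

```r
# Hypothetical data (not the OP's): a response with zero variance.
x <- c(1, 2, 3, 4, 5)
y <- rep(1, length(x))

fit <- lm(y ~ x)

# Coefficients should be (1, 0) up to rounding; residuals should be ~0.
print(coef(fit))
print(residuals(fit))

# Total sum of squares about the mean is exactly zero for a constant y,
# so R-squared and F are 0/0 in exact arithmetic.
tss <- sum((y - mean(y))^2)
rss <- sum(residuals(fit)^2)
print(c(tss = tss, rss = rss))

# summary() may warn about an essentially perfect fit; the values it
# reports for R-squared and F depend on R version and platform.
s <- suppressWarnings(summary(fit))
print(s$r.squared)
print(s$fstatistic)
```

Whether rss comes out as exactly 0 or as something on the order of
.Machine$double.eps^2 is what decides between NaN and an arbitrary-looking
value for R-squared and F.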


---JRG
John R. Gleason


On 09/06/2017 09:10 AM, S Ellison wrote:
>> I think what you're seeing is
>> https://en.wikipedia.org/wiki/Loss_of_significance.
> 
> Almost. 
> All the results in the OP's summary are reflections of finite precision in the analytically exact solution, leading to residuals smaller than the double precision limit. The summary is correctly warning that it's all potentially nonsense, and indeed the only things you can trust are the coefficient values (to within .Machine$double.eps or thereabouts).
> 
> Interestingly, though, my current version of R (3.4.0) gives numerically exact coefficients (c(1,0)) and identically zero standard errors.
> 
> So this particular example is apparently version-specific.
> 
> S Ellison