[R] how to get perfect fit of lm if response is constant

Ista Zahn istazahn at gmail.com
Fri Jan 8 22:11:55 CET 2010


Just to clarify this point: I don't think the problem is that y is
"perfectly fittable", but that it is constant. Since the variance of a
constant is zero, there is no variance to explain.
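
To make that concrete, here is a minimal sketch in R using the data from
Henrik's post below (the exact magnitudes of the "noise" will depend on the
platform's floating point arithmetic):

df <- data.frame(x = 1:10, y = 1)
var(df$y)                 # 0: a constant response has no variance to explain

fit <- lm(y ~ x, data = df)
max(abs(residuals(fit)))  # tiny, on the order of 1e-16 (machine precision)
coef(fit)                 # intercept ~ 1, slope ~ 0 up to rounding error

# Both the "explained" and the residual sums of squares are at rounding-error
# level, so the reported R-squared is essentially noise divided by noise.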

-Ista

On Fri, Jan 8, 2010 at 2:32 PM, Jan-Henrik Pötter <henrik.poetter at gmx.de> wrote:
> Thanks for the answer.
> The situation is that I don't know anything about y a priori. Of course I would not knowingly run a regression on constant y's, but isn't it a stability problem of the algorithm if I get an adjusted R-squared of 0.6788 for
> a least-squares fit on this kind of data? I think lm should give me a correct result even when y is perfectly fittable, because I never know in advance whether my data might turn out that way. If I have to offset y in this case, then my question becomes: how noisy do my y's have to be before I can rely on the lm result when I specify the formula y ~ x without an offset? And what if my y's become nearly linear (or nearly perfectly fittable by some other linear model)? So my question is really "How can I rely on lm's result if the formula is specified as y ~ x without an offset?" or "How do I prevent the result from becoming numerically incorrect if I happen to get nearly perfectly fittable y's?"
>
> Greetings
>
> Henrik
>
>
> -----Original Message-----
> From: Peter Ehlers [mailto:ehlers at ucalgary.ca]
> Sent: Friday, 8 January 2010 19:44
> To: Jan-Henrik Pötter
> Cc: r-help at r-project.org
> Subject: Re: [R] how to get perfect fit of lm if response is constant
>
> You need to review the assumptions of linear models:
> y is assumed to be the realization of a random variable,
> not a constant (or, more precisely: the deviations are
> assumed to be N(0, sigma^2)).
>
> If you 'know' that y is a constant, then you have
> two options:
>
> 1. don't do the regression because it makes no sense;
> 2. if you want to test lm()'s handling of the data:
>
> fm <- lm(y ~ x, data = df, offset = rep(1, nrow(df)))
>
> (or use: offset = y)
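>
> A minimal sketch of option 2 with the df from the post below (exact values
> may vary with the platform's floating point arithmetic):
>
> df <- data.frame(x = 1:10, y = 1)
> fm <- lm(y ~ x, data = df, offset = rep(1, nrow(df)))
> coef(fm)   # both coefficients are now (essentially) zero,
>            # since the response minus the offset is exactly zero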
>
>  -Peter Ehlers
>
> Jan-Henrik Pötter wrote:
>> Hello.
>>
>> Suppose the response variable of data.frame df is constant, so an analytically
>> perfect fit of a linear model is expected. Fitting a regression line using
>> lm results in residuals, slope and standard errors that are not exactly zero, which is
>> acceptable in some way, but erroneous. Worse, summary.lm shows
>> unacceptable error propagation in the calculation of the t value and the
>> corresponding p-value for the slope, as well as in R-squared: just consider the
>> adjusted R-squared of 0.6788! This result is independent of the mode used for
>> the input vectors. Is there any way to get the perfectly fitted regression
>> line from lm and prevent this error propagation? Rounding all values of the
>> lm object afterwards to some precision seems like a bad idea.
>> Unfortunately lm has no option for controlling calculation precision.
>>
>>
>>
>>> df<-data.frame(x=1:10,y=1)
>>
>>> myl<-lm(y~x,data=df)
>>
>>
>>
>>> myl
>>
>>
>>
>> Call:
>>
>> lm(formula = y ~ x, data = df)
>>
>>
>>
>> Coefficients:
>>
>> (Intercept)            x
>>
>>   1.000e+00    9.463e-18
>>
>>
>>
>>> summary(myl)
>>
>>
>>
>> Call:
>>
>> lm(formula = y ~ x, data = df)
>>
>>
>>
>> Residuals:
>>
>>        Min         1Q     Median         3Q        Max
>>
>> -1.136e-16 -1.341e-17  7.886e-18  2.918e-17  5.047e-17
>>
>>
>>
>> Coefficients:
>>
>>              Estimate Std. Error   t value Pr(>|t|)
>>
>> (Intercept) 1.000e+00  3.390e-17 2.950e+16   <2e-16 ***
>>
>> x           9.463e-18  5.463e-18 1.732e+00    0.122
>>
>> ---
>>
>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>
>>
>>
>> Residual standard error: 4.962e-17 on 8 degrees of freedom
>>
>> Multiple R-squared: 0.7145,     Adjusted R-squared: 0.6788
>>
>> F-statistic: 20.02 on 1 and 8 DF,  p-value: 0.002071
>>
>>
>
> --
> Peter Ehlers
> University of Calgary
> 403.202.3921
>



-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org


