[R] summary.lm() for zero variance response
Andrews, Chris
chrisaa at med.umich.edu
Wed Mar 12 13:51:22 CET 2014
I'm on 64-bit; you are on 32-bit. In case other R-helpers haven't already pointed you to it, see R FAQ 7.31. Machine precision is producing numbers that are very close to zero but not exactly zero, and dividing one such number by another is practically a random number generator. Also, I'm certain that t and F are computed separately (i.e., F is not obtained by computing t and then squaring it), so the relationship t^2 = F fails, again because of limited machine precision in the intermediate calculations.
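To make the point concrete, here is a minimal sketch. It is not taken from the summary.lm() source: the helper name `is_degenerate_fit` and its tolerance are illustrative assumptions, not part of base R. It flags a fit whose residual standard error is negligible relative to the scale of the response, and it checks that t^2 and F do agree (up to rounding) on an ordinary noisy response.

```r
# Hypothetical helper (not in base R): TRUE when the residual standard
# error is negligible compared with the scale of the response.
# The default tolerance is an assumption chosen for illustration.
is_degenerate_fit <- function(fit, tol = sqrt(.Machine$double.eps)) {
  y       <- fitted(fit) + residuals(fit)      # reconstruct the response
  y_scale <- max(abs(y))
  sigma   <- sqrt(sum(residuals(fit)^2) / df.residual(fit))
  sigma <= tol * y_scale
}

set.seed(1)
n <- 10
x <- rnorm(n)

# Constant response: residuals are exact zeros or rounding noise,
# depending on the platform, so t and F are 0/0 or noise/noise.
y0   <- rep(1, n)
fit0 <- lm(y0 ~ x)
is_degenerate_fit(fit0)

# Ordinary noisy response: here t^2 and F agree up to rounding error.
y1   <- 1 + 2 * x + rnorm(n)
fit1 <- lm(y1 ~ x)
s1   <- summary(fit1)
t_sq  <- s1$coefficients["x", "t value"]^2
f_val <- unname(s1$fstatistic["value"])
agree <- isTRUE(all.equal(t_sq, f_val))
```

On the constant response the helper should return TRUE whether the residuals come out as exact zeros (as on my machine) or as noise around 1e-17 (as on yours); in either case neither the t column nor the F statistic deserves interpretation.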
-----Original Message-----
From: Vito M. R. Muggeo [mailto:vito.muggeo at unipa.it]
Sent: Wednesday, March 12, 2014 8:37 AM
To: Andrews, Chris; r-help at r-project.org
Subject: Re: [R] summary.lm() for zero variance response
Hi Chris,
Here is my output (I have not yet installed R 3.0.3):
> n=10;k=1;summary(lm(rep(k,n)~rnorm(n)))
Call:
lm(formula = rep(k, n) ~ rnorm(n))
Residuals:
       Min         1Q     Median         3Q        Max
-1.465e-16  1.564e-18  1.764e-17  2.147e-17  3.492e-17

Coefficients:
              Estimate Std. Error    t value Pr(>|t|)
(Intercept)  1.000e+00  2.021e-17  4.949e+16   <2e-16 ***
rnorm(n)    -1.620e-17  2.236e-17 -7.240e-01    0.489
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 5.637e-17 on 8 degrees of freedom
Multiple R-squared: 0.6598, Adjusted R-squared: 0.6173
F-statistic: 15.52 on 1 and 8 DF, p-value: 0.004301
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: i386-w64-mingw32/i386 (32-bit)
On 12/03/2014 13:25, Andrews, Chris wrote:
> I get what I would expect: the t statistic and the F statistic are both undefined (0/0), as are the p-values.
>
>> n=10;k=1;summary(lm(rep(k,n)~rnorm(n)))
>
> Call:
> lm(formula = rep(k, n) ~ rnorm(n))
>
> Residuals:
>    Min     1Q Median     3Q    Max
>      0      0      0      0      0
>
> Coefficients:
>             Estimate Std. Error t value Pr(>|t|)
> (Intercept)        1          0     Inf   <2e-16 ***
> rnorm(n)           0          0      NA       NA
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Residual standard error: 0 on 8 degrees of freedom
> Multiple R-squared: NaN, Adjusted R-squared: NaN
> F-statistic: NaN on 1 and 8 DF, p-value: NA
>
>> sessionInfo()
> R version 3.0.2 (2013-09-25)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
>
>
> -----Original Message-----
> From: Vito M. R. Muggeo [mailto:vito.muggeo at unipa.it]
> Sent: Wednesday, March 12, 2014 6:27 AM
> To: r-help at r-project.org
> Subject: [R] summary.lm() for zero variance response
>
> dear all,
> a student of mine brought to my attention the following, somewhat odd,
> behaviour of summary.lm() when the response variance is zero (yes,
> possibly meaningless from a practical viewpoint). Namely something like
>
> n=10;k=1;summary(lm(rep(k,n)~rnorm(n)))
>
> The values of k, n and the covariate do not matter.
>
> Two awkward points are
> 1) the F stat is different from t squared
> 2) more importantly, p-values from the F-stat are far smaller (and
> "significant" at usual levels 0.05/0.01) than the p-values coming from
> summary(..)$coef[,"Pr(>|t|)"] (i.e. the usual Wald test). Differences
> are dramatic for n>1000, where p(tstat) is about 0.8 and p(Fstat) < 2.2e-16.
>
> I looked for "lm zero variance" or "lm deterministic data", or "lm zero
> residuals" but without success. Also ?lm does not include any warning
> about using it for zero variance data (as reported for instance in ?nls)
>
> Am I missing anything?
> thanks,
> vito
>
>
--
==============================================
Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726
http://dssm.unipa.it/vmuggeo
28th IWSM
International Workshop on Statistical Modelling
July 8-12, 2013, Palermo
http://iwsm2013.unipa.it
===============================================