[R] summary.lm() for zero variance response

Vito M. R. Muggeo vito.muggeo at unipa.it
Wed Mar 12 11:26:56 CET 2014


Dear all,
a student of mine brought to my attention the following somewhat odd 
behaviour of summary.lm() when the response has zero variance (yes, 
possibly meaningless from a practical viewpoint). Namely, something like

n <- 10; k <- 1; summary(lm(rep(k, n) ~ rnorm(n)))

The values of k, n and the covariate do not matter.

Two awkward points are:
1) the F statistic differs from the squared t statistic;
2) more importantly, the p-value from the F statistic is far smaller (and 
"significant" at the usual 0.05/0.01 levels) than the p-value from 
summary(..)$coef[,"Pr(>|t|)"] (i.e. the usual Wald test). The differences 
are dramatic for n>1000, where p(tstat) is approximately 0.8 and 
p(Fstat) < 2.2e-16.
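
For reference, the two points can be checked side by side with a minimal sketch like the one below. The seed and the object names (fit, s, t_slope, and so on) are arbitrary choices made here for illustration. Since the response is constant, the statistics are driven by rounding noise, so the printed values may differ across platforms and R versions, and some versions may emit a warning from summary().

```r
set.seed(1)              # arbitrary seed, only for reproducibility
n <- 10
k <- 1
x <- rnorm(n)
y <- rep(k, n)           # constant response: zero variance

fit <- lm(y ~ x)
# Some R versions warn about an essentially perfect fit; silence it here.
s <- suppressWarnings(summary(fit))

# Wald test for the slope, taken from the coefficient table
t_slope <- s$coefficients["x", "t value"]
p_t     <- s$coefficients["x", "Pr(>|t|)"]

# Overall F test, recomputed from the stored F statistic and its df
f_stat <- unname(s$fstatistic["value"])
p_f    <- unname(pf(f_stat,
                    s$fstatistic["numdf"],
                    s$fstatistic["dendf"],
                    lower.tail = FALSE))

# With a single covariate one would expect t_squared == F and p_t == p_F
print(c(t_squared = t_slope^2, F = f_stat, p_t = p_t, p_F = p_f))
```

Replacing y with a non-degenerate response such as rnorm(n) in the same sketch should give t_squared equal to F up to rounding, which makes the contrast easy to see.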

I searched for "lm zero variance", "lm deterministic data" and "lm zero 
residuals", but without success. Also, ?lm does not include any warning 
about using it with zero-variance data (unlike ?nls, for instance).
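
In the absence of such a warning, a user-side check could look like the following sketch. Note that fit_lm_checked is a hypothetical helper written here for illustration, not a function in R, and the tolerance tol = 1e-8 is an arbitrary assumption.

```r
# Hypothetical helper (not part of R): warn when the response is
# constant up to a relative tolerance, then fit the model as usual.
fit_lm_checked <- function(formula, data = NULL, tol = 1e-8) {
  mf <- model.frame(formula, data = data)
  y <- model.response(mf)
  spread <- diff(range(y))            # max(y) - min(y)
  scale <- max(1, abs(mean(y)))       # avoid a tiny scale when mean(y) is near 0
  if (spread <= tol * scale) {
    warning("response is (numerically) constant; ",
            "tests reported by summary() are not meaningful")
  }
  lm(formula, data = data)
}
```

For example, fit_lm_checked(rep(1, 10) ~ rnorm(10)) would issue the warning and still return the fitted "lm" object.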

Am I missing anything?
thanks,
vito


-- 
==============================================
Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726
http://dssm.unipa.it/vmuggeo

28th IWSM
International Workshop on Statistical Modelling
July 8-12, 2013, Palermo
http://iwsm2013.unipa.it


