[R] Error in predict.lm() for models without intercept?

isabella at ghement.ca isabella at ghement.ca
Wed Sep 17 05:58:50 CEST 2014


 

	Hi Rolf,  

	  BODY { font-family:Arial, Helvetica, sans-serif;font-size:12px; }
Thanks very much for your response.   You are right - my simulated
example works as intended, so it can't be used to get to the bottom of
this problem (if it is a problem at all).   

	Here is another example, which is the one I actually worked with when
I thought maybe something is not quite right in the universe.   

	The example is based on a real data set (please keep it
confidential), which is attached to this e-mail as mod.data.csv.  
This data set includes terminal run numbers for salmon, recorded at
Age_5, Age_4, Age_3 and Age_2.  A model of the form lm(Age_5 ~ 0 +
Age_4 + Age_3 + Age_2) is fitted to these data and the goal is to
visualize the effects of Age_4, Age_3 and Age_2 on Age_5.  For
biological reasons, this model is supposed to not have an intercept.  


	The attachment Effect_1.pdf shows what these effects look like.  If
the model has no intercept, should the confidence bands still flare up
as one moves away from the value of the predictor whose effect we care
about?   

	The attachment Effect_2.pdf replicates the effects plots but this
time using the effects package.   

	If predict() is correct, should we expect from statistical theory to
see that the confidence bands have this particular behaviour? 
Intuitively, I would have expected them to look like a fan plot that
starts out at zero and then flares up as we move away from zero. 

	Here is the R code I used to create the two attached plots (with R
x64 3.1.0).  In this code, Age_5 becomes y, Age_4 becomes x, Age_3
becomes z and Age_2 becomes v.   

	## read mod.data into R 

	mod.data  single predictor and no intercept using R code:
 >
 > ## simulate a response y and two predictors x and z
 >
 > x 
 > z 
 > y 
 >
 > ## fit a linear model with no intercept but with one predictor
 >
 > mod 
 > ## compute confidence bands (i.e., fitted values +/- 1.96 standard
errors of fitted values)
 >
 > conf.band.x  interval="confidence")
 >
 > ## display confidence bands
 >
 > conf.band.x  fit=conf.band.x[,"fit"],
 > upr=conf.band.x[,"upr"])
 >
 > matplot(x=seq(from=ceiling(min(x)),to=floor(max(x)),by=0.01),
y=conf.band.x, type="l", xlab="x", ylab="y")
 > abline(v=mean(x),lty=3,col="magenta")
 > title("Effect of x on y")
 >
 > According to statistical theory, in a model with no intercept and
one predictor, the standard errors should be directly
 > proportional to the value of x at which they are evaluated. If x=0,
the standard errors should also be zero. If x increases,
 > the standard errors should also increase. The resulting plot
produced by matplot shows this is not the case - the standard
 > errors appear to increase as one moves away from the average value
of x. We would expect this behaviour if the model included
 > an intercept, which is not the case here.
 >
 > Here is some R code for looking at standard errors of fitted values
when the model includes no intercept and two predictors x
 > and z. In this code, the value of the predictor z is set to its
average level.
 > ## linear model with no intercept but with two predictors
 >
 > mod 
 > conf.band.x  z = mean(z)),
 > interval="confidence")
 >
 > conf.band.x  fit=conf.band.x[,"fit"],
 > upr=conf.band.x[,"upr"])
 >
 > matplot(x=seq(from=ceiling(min(x)),to=floor(max(x)),by=0.01),
y=conf.band.x, type="l", xlab="x", ylab="y")
 > abline(v=mean(x),lty=3,col="magenta")
 > title("Partial Effect of x on y (obtained by setting z to its
average level)")
 >
 > Again, the standard errors seem to behave as though they would come
from a model with an intercept (given that they are
 > flaring up as one moves away from the average value of the
predictor x).
 >
 > I would very much appreciate any clarifications or suggestions for
how to fix this problem.
 >
 > If the problem is confirmed, it appears to also carry over to the
effects package in R, which constructs plots similar to the
 > ones produced by matplot above by relying on the predict()
function.
 -- 
 Rolf Turner
 Technical Editor ANZJS
 


More information about the R-help mailing list