[R] How to exclude insignificant intercepts using "step" function

Tue Jun 23 09:08:06 CEST 2009

I appreciate that you are trying to help, but I don't fully understand your
point. At one point I did say "... the intercept is not significantly
different from zero". I admit I also said "dropping the intercept term",
which in my loose use of terminology means forcing the intercept to a
value of zero. So yes, the intercept exists and has a value, but that value
is not significantly different from zero. That does not make the intercept
itself "non-significant", nor does it exclude an intercept in any way. If
that was your point, then I stand corrected on my terminology. If not,
perhaps you could expand a little more.

Perhaps the following will explain what I'm after. Fitting y ~ x1+x2 for
dataframe d1 gives the following:

> summary(lm(y~x1+x2, data=d1))

Call:
lm(formula = y ~ x1 + x2, data = d1)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.165377 -0.034284  0.001215  0.033799  0.127428 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.02074    0.01823   1.137    0.258    
x1           0.99515    0.02122  46.891   <2e-16 ***
x2           0.97811    0.02240  43.656   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.05937 on 97 degrees of freedom
Multiple R-squared: 0.9717,     Adjusted R-squared: 0.9711 
F-statistic:  1665 on 2 and 97 DF,  p-value: < 2.2e-16 

From my understanding, I would be justified in treating the intercept as
zero. If I force a fit with a zero intercept, I get different coefficients
and summary statistics, as follows:

> summary(lm(y~0+x1+x2, data=d1))

Call:
lm(formula = y ~ 0 + x1 + x2, data = d1)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.155509 -0.032272  0.004912  0.032568  0.130603 

Coefficients:
   Estimate Std. Error t value Pr(>|t|)    
x1  1.01297    0.01434   70.64   <2e-16 ***
x2  0.99715    0.01491   66.86   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.05946 on 98 degrees of freedom
Multiple R-squared: 0.997,      Adjusted R-squared: 0.9969 
F-statistic: 1.62e+04 on 2 and 98 DF,  p-value: < 2.2e-16 
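
A side note on comparing the two summaries: the jump in R-squared (0.9717 to
0.997) is not by itself evidence of a better fit. For a model without an
intercept, summary.lm computes R-squared against the uncentered total sum of
squares, sum(y^2), rather than the sum of squares about the mean. The sketch
below checks this on simulated data; the data frame is made up for
illustration and is not the d1 used above.

```r
# Simulated stand-in for d1 (an assumption; the real data are not shown)
set.seed(1)
d1 <- data.frame(x1 = runif(100), x2 = runif(100))
d1$y <- d1$x1 + d1$x2 + rnorm(100, sd = 0.06)

fit0 <- lm(y ~ 0 + x1 + x2, data = d1)
rss  <- sum(resid(fit0)^2)

# R-squared against the uncentered total sum of squares
r2_uncentered <- 1 - rss / sum(d1$y^2)
# R-squared against the sum of squares about the mean of y
r2_centered   <- 1 - rss / sum((d1$y - mean(d1$y))^2)

# summary.lm reports the uncentered version for a no-intercept model
all.equal(summary(fit0)$r.squared, r2_uncentered)
c(uncentered = r2_uncentered, centered = r2_centered)
```

The residual standard errors (0.05937 versus 0.05946 above) are on the same
scale in both fits and are the safer quantity to compare.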

For my real application, theory suggests the intercept is zero for each
of the thousands of groups in my dataset. Of course I can fit y ~ x1+x2 and,
where the summary suggests the intercept is not significantly different
from zero, refit y ~ -1+x1+x2. I just wondered whether step or some other
function could do that for me in one R expression.
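
The fit-then-refit procedure described above can be wrapped in a small
function and applied to every group. This is only a sketch under stated
assumptions: fit_drop_intercept, the 0.05 cutoff, the column name "group",
and the simulated data frame d are all hypothetical, and the function simply
automates the manual procedure rather than using step.

```r
# Hypothetical helper: fit y ~ x1 + x2, then refit without the intercept
# when the intercept's p-value exceeds alpha (alpha = 0.05 is an assumption).
fit_drop_intercept <- function(data, alpha = 0.05) {
  full  <- lm(y ~ x1 + x2, data = data)
  p_int <- coef(summary(full))["(Intercept)", "Pr(>|t|)"]
  if (isTRUE(p_int > alpha)) {
    lm(y ~ 0 + x1 + x2, data = data)  # intercept forced to zero
  } else {
    full                              # keep the intercept
  }
}

# Simulated data with a hypothetical grouping column
set.seed(1)
d <- data.frame(group = rep(1:3, each = 100),
                x1 = runif(300), x2 = runif(300))
d$y <- d$x1 + d$x2 + rnorm(300, sd = 0.06)

# One fitted model per group
fits <- lapply(split(d, d$group), fit_drop_intercept)
lapply(fits, coef)
```

With the helper defined, the per-group fitting is the single lapply
expression at the end.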

Thanks again.

David Winsemius wrote:
> 
> I think you should explain (to yourself primarily) what it means to  
> have a non-significant intercept. If you can justify on a theoretic  
> basis the exclusion of an intercept, then you may get more assistance.  
> However, if you are just naively questing after some mythical concept  
> of "significance", people may be less motivated to solve what most  
> would consider to be an "insignificant" question.
> 
> -- 
> DW
> 
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
> 
