[R] about lm restrictions...

Liaw, Andy andy_liaw at merck.com
Fri Jan 27 04:22:06 CET 2006



From: klebyn
> 
> Hello all R-users
> 
> 
> _question 1_
> 
> I need to fit a statistical model and build the corresponding ANOVA table,
> but I get different results from
> 
> the t-test (in the summary(lm.object) function) and
> the F-test (in anova(lm.object)).
> 
> Shouldn't these two approaches give me the same result, i.e.
> indicate the same significant terms in both tests?

No, because they are not the same tests.  The t-tests in summary.lm() test
whether each coefficient is zero, given that all other terms are present in
the model.  The F-tests in anova.lm() are sequential: each term is tested as
it is added to a model that contains only the terms listed before it.
Here's an example:

> set.seed(1)
> d <- data.frame(x1=runif(20), x2=runif(20), y=rnorm(20))
> fm <- lm(y ~ ., d)
> summary(fm)$coef
              Estimate Std. Error    t value   Pr(>|t|)
(Intercept)  1.0187254  0.5534310  1.8407452 0.08318123
x1          -1.6914784  0.6377065 -2.6524404 0.01675543
x2          -0.1817831  0.6618875 -0.2746435 0.78689983
> anova(fm)
Analysis of Variance Table

Response: y
          Df  Sum Sq Mean Sq F value  Pr(>F)  
x1         1  4.2341  4.2341  7.0936 0.01638 *
x2         1  0.0450  0.0450  0.0754 0.78690  
Residuals 17 10.1472  0.5969                  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
> anova(fm2 <- lm(y ~ x2 + x1, d))
Analysis of Variance Table

Response: y
          Df  Sum Sq Mean Sq F value  Pr(>F)  
x2         1  0.0797  0.0797  0.1336 0.71928  
x1         1  4.1994  4.1994  7.0354 0.01676 *
Residuals 17 10.1472  0.5969                  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Notice how the p-value for x1 in the last output matches that of the t-test,
because both test whether the coefficient for x1 is 0 given that x2 is
already in the model.  (For the same reason, the p-value for x2 in the
first anova() output matches that in the summary.lm() output, while the one
in the second anova() output does not.)
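
One way to check this correspondence directly, as a minimal sketch that
reuses the simulated data from the example above: for a term with one degree
of freedom that is entered last, the squared t statistic equals the
sequential F statistic.  The call to drop1() is an addition here (it is not
used in the example above); it reports the marginal F-test for every term,
whatever the order in the formula.

```r
# Same simulated data as in the example above.
set.seed(1)
d <- data.frame(x1 = runif(20), x2 = runif(20), y = rnorm(20))
fm <- lm(y ~ x1 + x2, d)

# x2 is entered last, so its sequential F-test is also its marginal test.
t_x2 <- coef(summary(fm))["x2", "t value"]
F_x2 <- anova(fm)["x2", "F value"]
all.equal(t_x2^2, F_x2)

# Marginal F-tests for every term, independent of the order in the formula;
# these p-values agree with the t-test p-values from summary(fm).
marg <- drop1(fm, test = "F")
marg
```

If the terms are reordered in the formula, anova() changes but drop1() and
summary() do not.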

I may be off, but I do not think the restrictions you mentioned have any
bearing on the analysis.  If x + z is restricted to something _for each
case_ then you do have to worry, but not the way you have it.  You can
choose the independent variables to take on any value you like (as in
designed experiments), so such restrictions should not matter.
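
To see what a restriction that holds _for each case_ does to the fit, here
is a rough sketch that rebuilds the 18-row design from the DATA section of
the quoted message and inspects the model matrix.  The object names (zi, xi,
dat, X, X0) are mine, and the sketch only illustrates the "1 not defined
because of singularities" note in the quoted output; it does not settle
which analysis is appropriate.

```r
# Rebuild the posted design: 3 z-levels crossed with 3 x-levels, 2 replicates.
zi <- rep(rep(1:3, each = 3), times = 2)   # which z column is 1 in each row
xi <- rep(1:3, times = 6)                  # which x column is 1 in each row
dat <- data.frame(diag(3)[zi, ], diag(3)[xi, ])
names(dat) <- c("z1", "z2", "z3", "x1", "x2", "x3")

# In the posted data both restrictions hold row by row.
stopifnot(rowSums(dat[, c("x1", "x2", "x3")]) == 1,
          rowSums(dat[, c("z1", "z2", "z3")]) == 1)

# Model with intercept: 10 columns, but the 9 interaction columns add up
# to the intercept column, so the matrix has rank 9 and one coefficient
# cannot be estimated.
X <- model.matrix(~ (z1 + z2 + z3):(x1 + x2 + x3), dat)
ncol(X)
qr(X)$rank
all(rowSums(X[, -1]) == 1)

# Model without intercept: 9 columns of full rank.
X0 <- model.matrix(~ -1 + (z1 + z2 + z3):(x1 + x2 + x3), dat)
ncol(X0)
qr(X0)$rank
```

With 18 observations and rank 9, either parameterization leaves 9 residual
degrees of freedom, which agrees with the quoted ANOVA table.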

Andy


 
> Note:
> 
> The system has two restrictions:
> 1) sum( x_i ) = 1
> 2) sum( z_j ) = 1
> 
> 
> 
> *output below*
> 
> _question 2_
> 
> 
> Do I have to consider the SST in the ANOVA table as having:
> 
> 1) N-2 d.f. because of 2 restrictions?
>  or
> 2) N-1 d.f. because of 1 global restriction: sum( x ) + sum( z ) = 2 ?
> 
> 
> I can't find any paper, book or other reference.
> If someone could point me to references for this type of model (with 2
> restrictions),
> I would be very grateful.
> 
> 
> Thanks a lot.
> Regards
>  
>  
> Cleber N. Borges
> 
> 
> 
> ###############################
> #         OUTPUT
> ###############################
> 
> 
> Coefficients: (1 not defined because of singularities)
>             Estimate Std. Error t value Pr(>|t|)   
> (Intercept)  15.5000     0.5270  29.409 2.97e-10 ***
> z1:x1        -5.0000     0.7454  -6.708 8.77e-05 ***
> z1:x2         0.5000     0.7454   0.671 0.519177   
> z1:x3        -3.0000     0.7454  -4.025 0.002996 **
> z2:x1        -6.0000     0.7454  -8.050 2.11e-05 ***
> z2:x2        -5.0000     0.7454  -6.708 8.77e-05 ***
> z2:x3        -4.5000     0.7454  -6.037 0.000193 ***
> z3:x1         1.0000     0.7454   1.342 0.212580   
> z3:x2         1.5000     0.7454   2.012 0.075029 . 
> z3:x3             NA         NA      NA       NA   
> 
> Analysis of Variance Table
> 
> Response: y
>           Df Sum Sq Mean Sq F value    Pr(>F)   
> z1:x1      1 16.674  16.674 30.0125 0.0003910 ***
> z1:x2      1 13.580  13.580 24.4446 0.0007977 ***
> z1:x3      1  1.190   1.190  2.1429 0.1772677   
> z2:x1      1 35.267  35.267 63.4800 2.287e-05 ***
> z2:x2      1 32.400  32.400 58.3200 3.202e-05 ***
> z2:x3      1 42.667  42.667 76.8000 1.061e-05 ***
> z3:x1      1  0.083   0.083  0.1500 0.7075349   
> z3:x2      1  2.250   2.250  4.0500 0.0750295 . 
> Residuals  9  5.000   0.556                     
> ---
> 
> 
> 
> 
> 
> ###############################
> #         DATA
> ###############################
> 
>   z1 z2 z3 x1 x2 x3  y
>   1  0  0  1  0  0 10
>   1  0  0  0  1  0 15
>   1  0  0  0  0  1 12
>   0  1  0  1  0  0 10
>   0  1  0  0  1  0 11
>   0  1  0  0  0  1 11
>   0  0  1  1  0  0 16
>   0  0  1  0  1  0 17
>   0  0  1  0  0  1 15
>   1  0  0  1  0  0 11
>   1  0  0  0  1  0 17
>   1  0  0  0  0  1 13
>   0  1  0  1  0  0  9
>   0  1  0  0  1  0 10
>   0  1  0  0  0  1 11
>   0  0  1  1  0  0 17
>   0  0  1  0  1  0 17
>   0  0  1  0  0  1 16
> 
> 
> 
> ###############################
> #         CODE
> ###############################
> 
> 
>  x <- read.table(file("clipboard"), header = TRUE)
> 
> ## NOT a Scheffé Model:
>  
>  x.lm <- lm( y ~ (z1+z2+z3):(x1+x2+x3), data=x)
>  summary(x.lm)
>  anova(x.lm)
> 
> 
> ## Scheffé Model: <- IS CORRECT the analysis below?
>  
>  x.lm <- lm( y ~ -1 + (z1+z2+z3):(x1+x2+x3), data=x)
>  summary(x.lm)
> 
>  x.aov <- aov( y ~  (z1+z2+z3):(x1+x2+x3), data=x)
>  summary(x.aov)
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
> 
>



