[R] Marginal (type II) SS for powers of continuous variables in a linear model?

Spencer Graves spencer.graves at pdf.com
Mon Aug 11 16:24:18 CEST 2003


I'm confused.  Consider the following example:

 > Df <- data.frame(x=1:9, y=rep(c(-1,1), length=9))
 > anova(lm(y~x, Df))
Analysis of Variance Table

Response: y
           Df    Sum Sq   Mean Sq   F value Pr(>F)
x          1 2.861e-34 2.861e-34 2.253e-34      1
Residuals  7    8.8889    1.2698
 > anova(lm(y~x+I(x^2), Df))
Analysis of Variance Table

Response: y
           Df    Sum Sq   Mean Sq   F value Pr(>F)
x          1 2.861e-34 2.861e-34 2.065e-34 1.0000
I(x^2)     1    0.5772    0.5772    0.4167 0.5425
Residuals  6    8.3117    1.3853
 > anova(lm(y~I(x^2)+x, Df))
Analysis of Variance Table

Response: y
           Df Sum Sq Mean Sq F value Pr(>F)
I(x^2)     1 0.0282  0.0282  0.0203 0.8912
x          1 0.5490  0.5490  0.3963 0.5522
Residuals  6 8.3117  1.3853
 >
	  In S-Plus 6.1, the ANOVA table is preceded by the statement "Terms 
added sequentially (first to last)".  From these examples, it certainly 
looks like that is what anova() is doing.  Apart from round-off error, the 
sum of squares and mean square for x are identical in the models without 
and with I(x^2).  In an example with a nonzero sum of squares for x, the F 
value would differ, because the mean square for residuals would be 
different, and the Pr(>F) would also be affected by the differing degrees 
of freedom.
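
	  As a rough sketch of that last point, assuming made-up data (Df2 and 
its 0.5 slope are hypothetical, chosen only so that x has a nonzero sum of 
squares), one could compare the x row of the two tables directly:

```r
# Hypothetical data: a linear trend in x plus the alternating -1/1 pattern.
Df2 <- data.frame(x = 1:9)
Df2$y <- 0.5 * Df2$x + rep(c(-1, 1), length.out = 9)

a1 <- anova(lm(y ~ x, Df2))           # x only
a2 <- anova(lm(y ~ x + I(x^2), Df2))  # x, then I(x^2) added sequentially

# The sequential SS for x should agree between the two fits ...
ss_equal <- isTRUE(all.equal(a1["x", "Sum Sq"], a2["x", "Sum Sq"]))
# ... while the F value for x should not, because the residual mean
# square and the residual degrees of freedom both change.
f_equal <- isTRUE(all.equal(a1["x", "F value"], a2["x", "F value"]))

print(c(ss_equal = ss_equal, f_equal = f_equal))
```

If the terms really are added first to last, ss_equal should come out 
TRUE and f_equal FALSE.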

	  The last example here puts I(x^2) before x in the model statement 
and gets a clearly different anova table.  (The coefficients should not 
change when the order of the terms is modified, though they could change 
if other terms are added.  I didn't check that for this example, but 
I've done this before and would be surprised if they were different.)
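
	  The parenthetical claim about the coefficients could be checked along 
these lines.  This is only a sketch: the object names are made up here, 
and the coefficients are matched by name because they are stored in the 
order of the model terms.

```r
Df <- data.frame(x = 1:9, y = rep(c(-1, 1), length.out = 9))

fit_x_first  <- lm(y ~ x + I(x^2), Df)
fit_x2_first <- lm(y ~ I(x^2) + x, Df)

# Reorder the second coefficient vector to match the first by name.
b1 <- coef(fit_x_first)
b2 <- coef(fit_x2_first)[names(b1)]
same_coef <- isTRUE(all.equal(b1, b2))

# The residual sum of squares should not depend on term order either;
# only the sequential split of the explained SS between the terms does.
same_rss <- isTRUE(all.equal(deviance(fit_x_first),
                             deviance(fit_x2_first)))

print(c(same_coef = same_coef, same_rss = same_rss))
```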

Best Wishes,
Spencer

Bjørn-Helge Mevik wrote:
> I've used Anova() from the car package to get marginal (aka type II)
> sum-of-squares and tests for linear models with categorical
> variables.  Is it possible to get marginal SSs also for continuous
> variables, when the model includes powers of the continuous variables?
> 
> For instance, if A and B are categorical ("factor"s) and x is
> continuous ("numeric"),
> 
> Anova (lm (y ~ A*B + x, ...))
> 
> will produce marginal SSs for all terms (A, B, A:B and x).  However,
> with 
> 
> Anova (lm (y ~ A*B + x + I(x^2), ...))
> 
> the SS for 'x' is calculated with I(x^2) present in the model, i.e. it
> is no longer marginal.
> 
> Using poly (x, 2) instead of x + I(x^2), one gets a marginal SS for
> the total effect of x, but not for the linear and quadratic effects
> separately.  (summary.aov() has a 'split' argument that can be used to
> get separate SSs, but these are not marginal.)
> 
>



