[R] meaning of tests presented in anova(ols(...)) {Design package}

Tue Jul 15 06:34:33 CEST 2008

Hi,

I am curious about how to interpret the table produced by
anova(ols(...)), from the Design package. I have a multiple linear
regression model, with some interaction, defined by:

ols(formula = log(ksat * 60 * 60) ~ log(sar) * pol(activity,
    3) + log(conc) * pol(sand, 3), data = sm.clean, x = TRUE,
    y = TRUE)

         n Model L.R.       d.f.         R2      Sigma
      1834       1203         14       0.48        1.2

Residuals:
   Min     1Q Median     3Q    Max
-5.033 -0.859  0.016  0.739  4.868

Coefficients:
                       Value Std. Error     t        Pr(>|t|)
Intercept         11.3886790  2.0220171  5.63 0.0000000205580
sar               -4.3991263  1.0157588 -4.33 0.0000156609226
activity         -40.0591221  5.6907822 -7.04 0.0000000000027
activity^2        33.0570116  5.0578520  6.54 0.0000000000819
activity^3        -8.1645147  1.3750370 -5.94 0.0000000034548
conc               0.3841260  0.0813200  4.72 0.0000024942478
sand              -0.0096212  0.0327415 -0.29 0.7689032898947
sand^2             0.0008495  0.0008589  0.99 0.3227487169683
sand^3             0.0000025  0.0000066  0.39 0.6994987342042
sar * activity    12.8134698  2.9513942  4.34 0.0000149300007
sar * activity^2  -9.9981381  2.6310765 -3.80 0.0001494462966
sar * activity^3   2.1481278  0.7168339  3.00 0.0027662261037
conc * sand       -0.0157426  0.0076013 -2.07 0.0384966958735
conc * sand^2      0.0003419  0.0001989  1.72 0.0857381555491
conc * sand^3     -0.0000027  0.0000015 -1.77 0.0777025949762

Looking at what I think are "marginal p-values", i.e. results of a
test of H0: coef_i = 0 against coef_i != 0, several coefficients are
not significant at the 0.05 level. Does a non-significant coefficient
warrant removing that term from the model, or just a mention in the
discussion?
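
To make my reading of the coefficient table concrete, here is a minimal
base-R sketch on simulated data. The data frame d and the variables x, z
and y are made up for illustration (this is not sm.clean), and lm() is
used in place of ols(). It assumes each row's t-test is the usual test of
a single coefficient with all other terms kept in the model, and checks
that the squared t statistic equals the F statistic from dropping only
that term:

```r
# Simulated data (hypothetical; not the sm.clean data from this post)
set.seed(1)
n <- 200
d <- data.frame(x = runif(n), z = rnorm(n))
d$y <- 1 + 2 * d$x - 3 * d$x^2 + 0.5 * d$z + rnorm(n)

# Full model with a cubic polynomial in x, and the same model
# with only the cubic term removed
full       <- lm(y ~ x + I(x^2) + I(x^3) + z, data = d)
drop_cubic <- lm(y ~ x + I(x^2) + z, data = d)

# t statistic for the cubic coefficient in the full model
t_cubic <- coef(summary(full))["I(x^3)", "t value"]

# F statistic for the nested comparison (1 numerator d.f.)
f_cubic <- anova(drop_cubic, full)$F[2]

# For a single coefficient the two tests coincide: t^2 equals F
all.equal(t_cubic^2, f_cubic)
```

If that reading is right, a single non-significant row for one polynomial
term only says that this one term adds little given everything else in
the model, not that the variable as a whole is unimportant.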

Compared to the coefficient table above, what tests are performed when
calling anova() on this object? Here is the output in R:

               Analysis of Variance          Response: log(ksat * 60 * 60)

 Factor                                        d.f. Partial SS MS     F     P
 sar  (Factor+Higher Order Factors)               4  168.43     42.11  27.0 <.0001
  All Interactions                                3  142.13     47.38  30.4 <.0001
 activity  (Factor+Higher Order Factors)          6  536.84     89.47  57.3 <.0001
  All Interactions                                3  142.13     47.38  30.4 <.0001
  Nonlinear (Factor+Higher Order Factors)         4  257.25     64.31  41.2 <.0001
 conc  (Factor+Higher Order Factors)              4  443.02    110.75  71.0 <.0001
  All Interactions                                3   76.74     25.58  16.4 <.0001
 sand  (Factor+Higher Order Factors)              6 1906.29    317.71 203.6 <.0001
  All Interactions                                3   76.74     25.58  16.4 <.0001
  Nonlinear (Factor+Higher Order Factors)         4  263.00     65.75  42.1 <.0001
 sar * activity  (Factor+Higher Order Factors)    3  142.13     47.38  30.4 <.0001
  Nonlinear                                       2   95.32     47.66  30.5 <.0001
  Nonlinear Interaction : f(A,B) vs. AB           2   95.32     47.66  30.5 <.0001
 conc * sand  (Factor+Higher Order Factors)       3   76.74     25.58  16.4 <.0001
  Nonlinear                                       2    4.98      2.49   1.6 0.203
  Nonlinear Interaction : f(A,B) vs. AB           2    4.98      2.49   1.6 0.203
 TOTAL NONLINEAR                                  8  455.20     56.90  36.5 <.0001
 TOTAL INTERACTION                                6  218.87     36.48  23.4 <.0001
 TOTAL NONLINEAR + INTERACTION                   10  573.36     57.34  36.7 <.0001
 REGRESSION                                      14 2631.53    187.97 120.4 <.0001
 ERROR                                         1819 2839.25      1.56
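
As a sanity check on how I am reading the columns, the sketch below
recomputes two rows from the numbers printed in the table. It assumes
(my assumption, not something I have confirmed in the Design
documentation) that MS = Partial SS / d.f., that F = MS / error MS, and
that P is the upper tail of an F distribution with (d.f., 1819) degrees
of freedom. The helper names f_row and p_row are made up for this
example:

```r
# Error line copied from the anova table above
ss_error <- 2839.25
df_error <- 1819
mse <- ss_error / df_error   # error mean square, about 1.56

# F statistic for a row, assuming F = (Partial SS / d.f.) / error MS
f_row <- function(partial_ss, df) (partial_ss / df) / mse

# P value for a row, assuming an F(df, df_error) reference distribution
p_row <- function(partial_ss, df) {
  pf(f_row(partial_ss, df), df, df_error, lower.tail = FALSE)
}

# "sar (Factor+Higher Order Factors)": table reports F = 27.0
f_sar <- f_row(168.43, 4)

# "conc * sand, Nonlinear": table reports F = 1.6 and P = 0.203
f_nl <- f_row(4.98, 2)
p_nl <- p_row(4.98, 2)

round(c(f_sar = f_sar, f_nl = f_nl, p_nl = p_nl), 3)
```

Both rows agree with the printed values to the precision shown, so the
table at least behaves like a set of ordinary partial F tests.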

Are more of the 'terms' significant (at p<0.05) because the model
terms are pooled? I have looked through Frank's book on the topic, but
can't quite wrap my head around what the above is telling me. I am
mostly interested in presenting a model for use as an applied tool, so
interpretation of the terms and interactions is very important.
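
Here is my current guess at what a pooled "Factor+Higher Order Factors"
row means, written as a base-R sketch on simulated data. The variables
a, b and y are hypothetical, lm() and poly(..., raw = TRUE) stand in for
ols() and pol(), and I am assuming the row is a joint test of every
coefficient that involves the factor (its main effect plus its
interaction terms):

```r
# Simulated data (hypothetical; not the sm.clean data from this post)
set.seed(2)
n <- 300
d <- data.frame(a = rnorm(n), b = runif(n))
d$y <- 1 + 0.5 * d$a + 2 * d$b - 3 * d$b^2 + d$a * d$b + rnorm(n)

# Full model: a interacting with a cubic polynomial in b
full <- lm(y ~ a * poly(b, 3, raw = TRUE), data = d)

# Reduced model: every term involving a removed
# (the main effect and the three a-by-polynomial interactions)
no_a <- lm(y ~ poly(b, 3, raw = TRUE), data = d)

# Joint (pooled) F test of all terms involving a
pooled <- anova(no_a, full)

pooled$Df[2]   # 4 d.f.: one main effect plus three interaction terms

# The F statistic is the pooled sum of squares per d.f.,
# divided by the error mean square of the full model
f_manual <- (pooled$`Sum of Sq`[2] / pooled$Df[2]) /
  (deviance(full) / df.residual(full))
all.equal(pooled$F[2], f_manual)
```

The 4 d.f. here matches the 4 d.f. on the sar and conc rows of my table,
which is why I suspect the pooled rows test a whole group of
coefficients at once rather than one at a time.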

Thanks,

Dylan