[R] decide between polynomial vs ordered factor model (lme)

Mon Jan 9 16:59:24 CET 2006

On 1/9/06, Leo Gürtler <leog at anicca-vijja.de> wrote:
> Dear all,
>
> two lme's, the data are available at:
>
> http://www.anicca-vijja.de/lg/hlm3_nachw.Rdata
>
> explanations of the data:
>
> nachw = post hoc knowledge tests at 6 measurement time points (equally
> spaced)
> zeitn = time points (n = 6)
> subgr = small learning groups (n = 28)
> gru = 4 different groups = treatment factor
>
> levels: time (=zeitn) (n = 6) within subject (n = 4) within small groups
> (=subgr) (n = 28), i.e. n = 4 * 28 = 112 persons and 112 * 6 = 672 data points
>
> library(nlme)
> fitlme7 <- lme(nachw ~ I(zeitn-3.5) + I((zeitn-3.5)^2) +
> I((zeitn-3.5)^3) + I((zeitn-3.5)^4)*gru, random = list(subgr = ~ 1,
> subject = ~ zeitn), data = hlm3)
>
> fit5 <- lme(nachw ~ ordered(I(zeitn-3.5))*gru, random = list(subgr =
> ~ 1, subject = ~ zeitn), data = hlm3)
>
> anova( update(fit5, method="ML"), update(fitlme7, method="ML") )
>
>  > anova( update(fit5, method="ML"), update(fitlme7, method="ML") )
>                                Model df      AIC      BIC    logLik   Test  L.Ratio p-value
> update(fit5, method = "ML")        1 29 2535.821 2666.619 -1238.911
> update(fitlme7, method = "ML")     2 16 2529.719 2601.883 -1248.860 1 vs 2 19.89766  0.0978
>
> shows that the two models fit about equally well, although I know about
> the uncertainty of ML tests with lme(). Both models show that the ^2 and
> the ^4 terms are important parts of the model.
>
> My question is:
>
> - Is it legitimate to choose a model based on these outputs according to
> theoretical considerations instead of statistical tests that do not really
> show a superiority of one model over the other?
>
> - Is there another criterion I've overlooked to decide which model can be
> clearly preferred?
>
> - The idea behind that is that in the one model (fit5) the second
> contrast of the factor (gru) is statistically significant, although not
> the whole factor in the anova output.
> In the other model, this is not the case.
> Of theoretical interest is of course the significance of the second
> contrast of gru, as it shows a tendency of one treatment being slightly
> superior. I want to choose this model but I am not sure whether this is
> a proper course of action. Both models show this trend, but only one
> model clearly indicates that this trend bears some empirical meaning.
>
> Thanks for any suggestions,

The comparisons may be shown more clearly if you create the ordered
factor and a second version of the ordered factor that has its
contrasts set so that it produces a 4th-order polynomial.  That is, set

hlm3$ozeit <- ordered(hlm3$zeitn)
hlm3$ozeit4 <- C(hlm3$ozeit, contr.poly, 4)

then define one model in terms of ozeit and a second model in terms of ozeit4.
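
A minimal sketch of what the two factors look like and how the two fits
might be set up. The toy vector below only stands in for hlm3$zeitn, and
the names fit_full and fit_poly4 are hypothetical. Note that ozeit4 * gru
lets all four polynomial terms interact with gru, which is an assumption
here and differs from fitlme7 above, where only the quartic term
interacts.

```r
# Toy time index with the same six equally spaced points as zeitn
zeitn <- rep(1:6, times = 4)

ozeit  <- ordered(zeitn)            # default contr.poly: 5 contrast columns
ozeit4 <- C(ozeit, contr.poly, 4)   # keep only the first 4 polynomial columns

n_full  <- ncol(contrasts(ozeit))   # number of contrasts for the full factor
n_poly4 <- ncol(contrasts(ozeit4))  # number of contrasts for the 4th-order version

# Sketch of the two fits; this part runs only if hlm3 has been loaded
if (exists("hlm3")) {
  library(nlme)
  hlm3$ozeit  <- ordered(hlm3$zeitn)
  hlm3$ozeit4 <- C(hlm3$ozeit, contr.poly, 4)
  fit_full  <- lme(nachw ~ ozeit * gru,
                   random = list(subgr = ~ 1, subject = ~ zeitn),
                   data = hlm3, method = "ML")
  fit_poly4 <- lme(nachw ~ ozeit4 * gru,
                   random = list(subgr = ~ 1, subject = ~ zeitn),
                   data = hlm3, method = "ML")
  # Likelihood ratio test for the terms involving the dropped 5th-order contrast
  print(anova(fit_poly4, fit_full))
}
```

Because both models are then written with the same kind of term, they
differ only in the 5th-order contrast and its interactions with gru.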

I would go further and create a new binary factor from gru that
contrasted level 2 against the other three levels and use that instead
of gru.
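
One way such a binary factor could be built, shown on a toy factor that
stands in for hlm3$gru. The name gru2 and the labels are hypothetical,
and the sketch assumes gru is a factor whose second level is the
treatment of interest.

```r
# Toy stand-in for hlm3$gru: four treatment groups
gru <- factor(rep(1:4, each = 3))

# TRUE for the second level of gru, FALSE for the other three levels
gru2 <- factor(gru == levels(gru)[2],
               levels = c(FALSE, TRUE),
               labels = c("other", "level2"))

# Cross-tabulate to check that only level 2 maps to "level2"
print(table(gru, gru2))
```

With the real data the same two steps would be applied to hlm3$gru and
the result stored as a new column of hlm3.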

For a model fit by lme I would use the one-argument form of anova to
assess the significance of terms in the fixed effects.  (That advice
doesn't hold for models fit by lmer - at least at present.)
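
As an illustration of the one-argument form, here is a small sketch on
the Orthodont data that ships with nlme (not the data from this thread).
The model itself is only an assumption chosen to have a main effect of
time, a group factor and their interaction.

```r
library(nlme)

# Random-intercept model on a data set bundled with nlme
fm <- lme(distance ~ age * Sex, data = Orthodont, random = ~ 1 | Subject)

# One-argument anova: F-tests for the fixed-effects terms
tab_seq  <- anova(fm)                     # sequential tests (the default)
tab_marg <- anova(fm, type = "marginal")  # each term adjusted for the others

print(tab_seq)
print(tab_marg)
```

The sequential table tests each term after those listed before it, so
the order of terms in the formula matters for that version.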