[R-sig-ME] alternative interaction representations

Sun Aug 22 08:47:47 CEST 2010

# This representation fits two linear slopes, one below and one above
# conc = 300, splicing them at 0 (this assumes CO2new$conc is still on its
# original scale, i.e., not yet centered at 300):
CO2new$conc1 <- ifelse(CO2new$conc < 300, CO2new$conc - 300, 0)
CO2new$conc2 <- ifelse(CO2new$conc > 300, CO2new$conc - 300, 0)
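
A quick check of what these two columns encode, as a minimal sketch. It rebuilds the subset from the built-in CO2 data and assumes conc is on its original scale; the data frame name `d` is only for illustration and is not part of the analysis above.

```r
# Assumption: conc is uncentered here (original scale from datasets::CO2)
d <- subset(CO2, Type == "Quebec" & Treatment == "nonchilled")
d$conc1 <- ifelse(d$conc < 300, d$conc - 300, 0)   # distance below 300, else 0
d$conc2 <- ifelse(d$conc > 300, d$conc - 300, 0)   # distance above 300, else 0

# The two pieces add up to conc centered at 300, so both are 0 at the knot
stopifnot(all(d$conc1 + d$conc2 == d$conc - 300))

# One row per distinct concentration
unique(d[, c("conc", "conc1", "conc2")])
```

Because both columns are 0 at conc = 300, the two fitted line segments meet there, which is what "splicing them at 0" refers to.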

# Basic LMM (lmer() comes from the lme4 package)
library(lme4)
print(LMM <- lmer(uptake ~ conc1 + conc2 + (1 | Plant), data=CO2new), cor=FALSE)
# ... the linear effect of conc on uptake is significant below 300, but no
# longer significant above 300
# ... the intercept estimates the uptake at conc=300
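
The intercept interpretation, and the relation to the conc + stage2 coding from the original post, can be checked with a fixed-effects-only sketch. This uses lm() as a stand-in and ignores the Plant random effect, so it illustrates only the parameterization, not the LMM itself; `d`, `concc`, `fitA`, and `fitB` are illustrative names.

```r
# Fixed-effects-only sketch with lm(); the Plant random effect is ignored here
d <- subset(CO2, Type == "Quebec" & Treatment == "nonchilled")
d$conc1  <- ifelse(d$conc < 300, d$conc - 300, 0)
d$conc2  <- ifelse(d$conc > 300, d$conc - 300, 0)
d$concc  <- d$conc - 300          # conc centered at 300, as in the original post
d$stage2 <- d$conc2               # hinge term from the original post

fitA <- lm(uptake ~ conc1 + conc2, data = d)    # two separate slopes
fitB <- lm(uptake ~ concc + stage2, data = d)   # slope plus change in slope

# Same fitted values: both codings span the same column space
stopifnot(isTRUE(all.equal(fitted(fitA), fitted(fitB))))

# Intercept = predicted uptake at conc = 300 (where conc1 = conc2 = 0)
at300 <- predict(fitA, newdata = data.frame(conc1 = 0, conc2 = 0))
stopifnot(isTRUE(all.equal(unname(at300), unname(coef(fitA)["(Intercept)"]))))

# In fitB, 'stage2' is the change in slope, so the slope above 300 is the sum
stopifnot(isTRUE(all.equal(unname(coef(fitA)["conc2"]),
                           unname(coef(fitB)["concc"] + coef(fitB)["stage2"]))))
```

The practical difference is in what each coefficient tests: with conc1 and conc2 each slope is tested against zero, whereas with conc and stage2 the stage2 coefficient tests the change in slope at 300.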

# To test whether there is significant between-plant variance in
# slopes below and above conc = 300:
# Varying-slopes LMM
print(LMM.conc.1 <- lmer(uptake ~ conc1 + conc2 + (1 | Plant) +
                         (0+conc1 | Plant), data=CO2new), cor=FALSE)
print(LMM.conc.2 <- lmer(uptake ~ conc1 + conc2 + (1 | Plant) +
                         (0+conc2 | Plant), data=CO2new), cor=FALSE)
# Larger models, left commented out:
#print(LMM.conc.1.2 <- lmer(uptake ~ conc1 + conc2 + (1 | Plant) +
#                           (0+conc1 | Plant) + (0+conc2 | Plant), data=CO2new), cor=FALSE)
#print(LMM.conc.12 <- lmer(uptake ~ conc1 + conc2 + (1 + conc1 +
#                          conc2 | Plant), data=CO2new), cor=FALSE)

anova(LMM, LMM.conc.1)
anova(LMM, LMM.conc.2)
# Apparently there is not enough information in the data to test the
# between-plant variance in slopes.
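
One way to see how little information is available, as a rough sketch: count the grouping levels and the observations per plant on each side of 300. This again rebuilds the subset from the built-in CO2 data under the illustrative name `d`.

```r
d <- subset(CO2, Type == "Quebec" & Treatment == "nonchilled")
d$Plant <- droplevels(d$Plant)   # drop the plants excluded by subset()

nlevels(d$Plant)                 # number of plants, i.e., grouping levels
nrow(d)                          # total number of observations

# Observations per plant below 300 (TRUE) and above 300 (FALSE)
table(d$Plant, d$conc < 300)
```

This subset holds three plants with seven concentrations each, so every slope variance has to be estimated from only three plant-specific slopes.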

Reinhold Kliegl

On Sat, Aug 21, 2010 at 7:07 PM, Sebastian P. Luque <spluque at gmail.com> wrote:
> Hi,
>
> ## With the CO2 data, suppose we want to build a LME model of 'uptake' with
> ## 'conc' (continuous) and want to know whether there is a change in slope
> ## at conc=300, with random slopes for plants
>
> CO2new <- subset(CO2, Type == "Quebec" & Treatment == "nonchilled")
> CO2new <- within(CO2new, {
>    ## The more intuitive way to set up the interaction is to first define
>    ## a factor breaking up the 'conc' predictor
>    stage1 <- cut(conc, breaks=c(floor(min(conc)), 300,
>                          ceiling(max(conc))),
>                  labels=c("pre", "post"), include.lowest=TRUE)
>    ## Alternative, direct representation of interaction
>    stage2 <- ifelse(conc > 300, conc - 300, 0)
>    ## We center conc at 300 for interpreting intercept here
>    conc <- conc - 300
> })
> str(CO2new)
> xyplot(uptake ~ conc, data=CO2new, groups=Plant, type="b")
>
> ## Consider a model with fixed effects for intercept, conc, and varying
> ## slopes.  Using the more intuitive representation:
>
> (fm1 <- lmer(uptake ~ conc + conc:stage1 + (conc:stage1 | Plant), data=CO2new))
>
> ## And using the direct representation of the interaction
>
> (fm2 <- lmer(uptake ~ conc + stage2 + (conc + stage2 | Plant), data=CO2new))
>
> ## In this simple case, it doesn't seem to matter which representation is
> ## used.  For other models where an interaction with another factor, say
> ## Type, is needed in the model to indicate 3-way interactions with conc
> ## then the latter seems to allow for a simpler model (which may impact
> ## lmer performance) because the interaction would then be modelled as a
> ## 2-way interaction.
> ##
> ## Is this a fair comparison of using direct representations of
> ## interactions compared to the more natural factor-based representations?
> ## Overall, is it preferable to use one rather than the other?
>
> Cheers,
>
> --
> Seb
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>