[R-sig-ME] gamm4 model formulation clarification

Sun Aug 1 02:42:14 CEST 2010

Thanks, though I'm not clear on how assigning contrasts to g then
adding it as a fixed effect is different from the original
specification, where the smooth is told to vary by g but g isn't in
the fixed effects; instead I had ddB*design in the fixed effects,
which I thought would be equivalent to having g and the contrasts you
suggested. However, it seems that I get different results between the
two approaches (see below). Any ideas why? I presume I'm missing
something!

Also, I'm just now thinking that neither approach really gets at what
I want to test; what I want to test whether the model is improved by
(1) letting the smooth vary as a function of design, (2) letting the
smooth vary as a function of ddB, and (3) letting the smooth vary as a
function of both design & ddB. That is, is there a design*soa
interaction, a ddB*soa interaction and/or a design*ddB*soa
interaction? In the case of support for any of these interactions, I'd
also be interested in pinpointing the timeframe over which the
interactions take place.

Thoughts?

#here is the "ddB*design+s(soa,by=g)" versus "g+s(soa,by=g)" fits and
output, showing that they're not identical:

>fit1 = gamm4(
>	data = ab
>	, formula = rt ~ ddB*design+s(soa,by=g)
>	, random = ~ (1|id)
>)
>
> print(fit1$mer,corr=F)
Linear mixed model fit by REML
     AIC     BIC logLik deviance REMLdev
 -564383 -564261 282205  -564567 -564411
Random effects:
 Groups   Name              Variance   Std.Dev.
 id       (Intercept)       6.9251e-08 0.00026316
 Xr.4     s(soa):g0ddB NCD 6.1200e-05 0.00782305
 Xr.3     s(soa):g0ddB CD  1.3823e-03 0.03717992
 Xr.2     s(soa):g+ddB NCD 2.4589e-04 0.01568096
 Xr.1     s(soa):g+ddB CD  2.0301e-03 0.04505718
 Residual                   2.0814e-07 0.00045623
Number of obs: 45013, groups: id, 20; Xr.4, 8; Xr.3, 8; Xr.2, 8; Xr.1, 8

Fixed effects:
                        Estimate Std. Error t value
X(Intercept)           2.455e-03  5.896e-05   41.63
XddB+ddB               5.104e-05  5.232e-06    9.76
XdesignNCD            -2.665e-04  7.066e-06  -37.72
XddB+ddB:designNCD    -4.450e-05  1.007e-05   -4.42
Xs(soa):g+ddB CDFx1  -1.592e-04  5.178e-05   -3.07
Xs(soa):g+ddB NCDFx1  1.099e-04  6.717e-05    1.64
Xs(soa):g0ddB CDFx1   9.027e-06  4.942e-05    0.18
Xs(soa):g0ddB NCDFx1  3.618e-05  4.430e-05    0.82
>
>
> summary(fit1$gam)

Family: gaussian
Link function: identity

Formula:
rrt ~ ddB * design + s(soa, by = g)

Parametric coefficients:
                    Estimate Std. Error t value Pr(>|t|)
(Intercept)        2.455e-03  3.706e-06 662.327  < 2e-16 ***
ddB+ddB            5.104e-05  5.230e-06   9.760  < 2e-16 ***
designNCD         -2.665e-04  7.260e-06 -36.710  < 2e-16 ***
ddB+ddB:designNCD -4.450e-05  1.005e-05  -4.428 9.55e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Approximate significance of smooth terms:
                    edf Ref.df       F  p-value
s(soa):g+ddB CD  8.042  8.042 205.327  < 2e-16 ***
s(soa):g+ddB NCD 5.419  5.419   4.375 0.000361 ***
s(soa):g0ddB CD  7.795  7.795 213.477  < 2e-16 ***
s(soa):g0ddB NCD 3.977  3.977   4.461 0.001358 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R-sq.(adj) =  0.0964lmer.REML score = -5.6441e+05  Scale est. =
2.0814e-07  n = 45013

>fit2 = gamm4(
>	data = ab
>	, formula = rt ~ g+s(soa,by=g)
>	, random = ~ (1|id)
>)
>
> print(fit2$mer,corr=F)
Linear mixed model fit by REML
     AIC     BIC logLik deviance REMLdev
 -564381 -564259 282205  -564567 -564409
Random effects:
 Groups   Name              Variance   Std.Dev.
 id       (Intercept)       6.9252e-08 0.00026316
 Xr.4     s(Ssoa):g0ddB NCD 6.1196e-05 0.00782278
 Xr.3     s(Ssoa):g0ddB CD  1.3824e-03 0.03718002
 Xr.2     s(Ssoa):g+ddB NCD 2.4589e-04 0.01568093
 Xr.1     s(Ssoa):g+ddB CD  2.0302e-03 0.04505719
 Residual                   2.0814e-07 0.00045623
Number of obs: 45013, groups: id, 20; Xr.4, 8; Xr.3, 8; Xr.2, 8; Xr.1, 8

Fixed effects:
                        Estimate Std. Error t value
X(Intercept)           2.336e-03  5.890e-05   39.66
Xg.34-12              -2.879e-05  5.035e-06   -5.72
Xg.24-13              -2.887e-04  5.032e-06  -57.38
Xg.14-23               2.225e-05  5.035e-06    4.42
Xs(soa):g+ddB CDFx1  -1.592e-04  5.178e-05   -3.07
Xs(soa):g+ddB NCDFx1  1.099e-04  6.717e-05    1.64
Xs(soa):g0ddB CDFx1   9.027e-06  4.942e-05    0.18
Xs(soa):g0ddB NCDFx1  3.618e-05  4.430e-05    0.82
>
>
> summary(fit2$gam)

Family: gaussian
Link function: identity

Formula:
rrt ~ g + s(soa, by = g)

Parametric coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.336e-03  2.637e-06 885.684  < 2e-16 ***
g.34-12     -2.879e-05  5.025e-06  -5.728 1.02e-08 ***
g.24-13     -2.887e-04  5.275e-06 -54.741  < 2e-16 ***
g.14-23      2.225e-05  5.025e-06   4.428 9.55e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Approximate significance of smooth terms:
                    edf Ref.df       F  p-value
s(soa):g+ddB CD  8.042  8.042 205.328  < 2e-16 ***
s(soa):g+ddB NCD 5.419  5.419   4.375 0.000361 ***
s(soa):g0ddB CD  7.795  7.795 213.477  < 2e-16 ***
s(soa):g0ddB NCD 3.977  3.977   4.462 0.001357 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R-sq.(adj) =  0.0964lmer.REML score = -5.6441e+05  Scale est. =
2.0814e-07  n = 45013

On Sat, Jul 31, 2010 at 6:47 PM, Reinhold Kliegl
<reinhold.kliegl at gmail.com> wrote:
> Sorry, the following line should be:
> constrasts(ab$g) <- cmat
>
> On Sat, Jul 31, 2010 at 11:30 PM, Reinhold Kliegl
> <reinhold.kliegl at gmail.com> wrote:
>> Perhaps assign contrasts to g. For example,
>> # ... ... contrast estimates
>> cmat <- matrix(c(  -1/2, -1/2, +1/2, +1/2,   # Main effect 1
>>                              -1/2, +1/2, -1/2, +1/2,   # Main effect 2
>>                              +1/2, -1/2, -1/2, +1/2),  4,  3,    # Interaction
>>                               dimnames=list(c("A1", "A2", "A3", "A4"),
>>                                c(".34-12", ".24-13", ".14-23")))
>>
>> constrasts(g) <- cmat
>>  fit = gamm4(
>>      data = ab
>>      , formula = rrt ~ g+s(soa,by=g)
>>     , random = ~ (1|id)
>>     , family = 'gaussian'
>>   )
>>
>> Reinhold Kliegl
>>
>> On Sat, Jul 31, 2010 at 6:10 AM, Mike Lawrence <Mike.Lawrence at dal.ca> wrote:
>>> Hi folks,
>>>
>>> I have some data that form a 2x2x15 design, where the 15 level
>>> variable is a discrete sampling of a ratio variable with clear
>>> non-linearities (see bottom for a dput() of the means). I came across
>>> gamm4 tonight and it looks like it will help tackle this data, but I'm
>>> not sure how to tell it to let the smooth of the 15 level variable
>>> vary as a function of BOTH of the other predictor variables. As a
>>> hack, I created a dummy 4 level variable that represented the
>>> combination of the 2x2 level variables, but I'm not positive that this
>>> was the right thing to do. Any feedback would be greatly appreciated.
>>> Here's how I had things set up:
>>>
>>>> str(ab)
>>> 'data.frame':   49668 obs. of  5 variables:
>>>  $ id       : Factor w/ 20 levels "5","6","7","8",..: 1 1 1 1 1 1 1 1 1 1 ...
>>>  $ design   : Factor w/ 2 levels "CD","NCD": 1 1 1 1 1 1 1 1 1 1 ...
>>>  $ ddB      : Factor w/ 2 levels "0ddB","+ddB": 1 1 1 1 1 1 1 1 1 1 ...
>>>  $ soa      : num  300 300 300 300 300 300 300 300 300 300 ...
>>>  $ rt       : num  441 373 440 290 221 ...
>>>  $ g        : Factor w/ 4 levels "+ddB CD","+ddB NCD",..: 3 3 3 3 3 3
>>> 3 3 3 3 ...
>>>
>>> #id is the random effect (human participants)
>>> #design and ddB are 2-level fixed effects
>>> #soa is a 15 level fixed effect
>>> #rt is the data to predict (there are dozens of observations for each
>>> combination of the random and fixed effects)
>>>
>>> #fit using dummy variable "g" to get different soa smooths per
>>> combination of ddB and design
>>> fit = gamm4(
>>>        data = ab
>>>        , formula = rrt ~ ddB*design+s(soa,by=g)
>>>        , random = ~ (1|id)
>>>        , family = 'gaussian'
>>> )
>>>
>>>
>>> #here's dput() ouput of the 2x2x15 means, revealing that soa is
>>> clearly non-linear
>>>> dput(means)
>>> structure(list(ddB = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L,
>>> 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
>>> 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
>>> 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
>>> 2L, 1L, 2L, 1L, 2L), .Label = c("0ddB", "+ddB"), class = "factor"),
>>>    design = structure(c(1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L,
>>>    1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L,
>>>    1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L,
>>>    2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L,
>>>    2L, 2L, 1L, 1L, 2L, 2L), .Label = c("CD", "NCD"), class = "factor"),
>>>    soa = c(-175, -175, -175, -175, -125, -125, -125, -125, -75,
>>>    -75, -75, -75, -25, -25, -25, -25, 25, 25, 25, 25, 75, 75,
>>>    75, 75, 125, 125, 125, 125, 175, 175, 175, 175, 250, 250,
>>>    250, 250, 350, 350, 350, 350, 450, 450, 450, 450, 550, 550,
>>>    550, 550, 700, 700, 700, 700, 900, 900, 900, 900, 1100, 1100,
>>>    1100, 1100), rt = structure(c(444.93273739708, 441.513208123373,
>>>    471.546335977687, 472.283526755609, 444.056771928409, 442.461563636396,
>>>    472.166623352385, 474.462330421383, 445.513188139913, 444.966221088285,
>>>    475.669898472348, 461.315736508479, 442.982086750377, 435.355068015326,
>>>    458.89264807265, 455.638816561315, 434.340063293089, 415.709247937572,
>>>    450.359216604288, 438.114012378908, 420.982354512772, 404.811090521404,
>>>    442.721222728774, 430.421981079446, 406.373845346035, 393.411137886923,
>>>    439.687373941982, 427.611295261772, 398.185439780545, 384.007719475387,
>>>    429.451304040456, 431.531830923294, 390.486982286177, 380.880066145206,
>>>    431.287499227013, 435.926925139256, 382.937376664574, 382.020452210116,
>>>    439.606939778423, 441.751714989440, 384.546631449636, 385.477927260379,
>>>    440.7286749068, 442.790235121405, 387.589278670798, 386.68589658312,
>>>    440.384534575679, 440.376618739428, 392.649929780933, 392.496542853778,
>>>    442.673509587221, 446.148838198613, 401.042352162140, 394.475542684099,
>>>    441.072702415870, 440.157897723193, 406.059905100442, 398.292572833949,
>>>    443.899793077701, 445.715779415161), .Dim = c(60L, 1L), .Dimnames = list(
>>>        NULL, NULL))), .Names = c("ddB", "design", "soa", "rt"
>>> ), row.names = c(NA, 60L), class = "data.frame")
>>>
>>> #visualise the means
>>> library(ggplot2)
>>> ggplot(
>>>        data = means
>>>        , mapping = aes(
>>>                x = soa
>>>                , y = rt
>>>                , colour = design
>>>                , linetype = ddB
>>>        )
>>> )+
>>> geom_line()+
>>> geom_point(shape=21,fill='white')
>>>
>>> --
>>> Mike Lawrence
>>> Graduate Student
>>> Department of Psychology
>>> Dalhousie University
>>>
>>> Looking to arrange a meeting? Check my public calendar:
>>> http://tr.im/mikes_public_calendar
>>>
>>> ~ Certainty is folly... I think. ~
>>>
>>> _______________________________________________
>>> R-sig-mixed-models at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>
>>
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>

-- 
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University

Looking to arrange a meeting? Check my public calendar:
http://tr.im/mikes_public_calendar

~ Certainty is folly... I think. ~