[R-sig-ME] gamm4 model formulation clarification

Sun Aug 1 08:48:27 CEST 2010

On the difference between the two specifications: Perhaps you have
treatment contrasts for design and soa. Then the fixed effect
coefficients will not estimate the main effect and the interaction.
You would need to specify  "contrasts(design) <- contr.sum(2)/2, etc."
for them if you specify the contrast matrix as a I suggested (which
returns estimates of the difference between levels, not as a
difference from the grand mean as the default).

Reinhold Kliegl

On Sun, Aug 1, 2010 at 2:42 AM, Mike Lawrence <Mike.Lawrence at dal.ca> wrote:
> Thanks, though I'm not clear on how assigning contrasts to g then
> adding it as a fixed effect is different from the original
> specification, where the smooth is told to vary by g but g isn't in
> the fixed effects; instead I had ddB*design in the fixed effects,
> which I thought would be equivalent to having g and the contrasts you
> suggested. However, it seems that I get different results between the
> two approaches (see below). Any ideas why? I presume I'm missing
> something!
>
> Also, I'm just now thinking that neither approach really gets at what
> I want to test; what I want to test whether the model is improved by
> (1) letting the smooth vary as a function of design, (2) letting the
> smooth vary as a function of ddB, and (3) letting the smooth vary as a
> function of both design & ddB. That is, is there a design*soa
> interaction, a ddB*soa interaction and/or a design*ddB*soa
> interaction? In the case of support for any of these interactions, I'd
> also be interested in pinpointing the timeframe over which the
> interactions take place.
>
> Thoughts?
>
> #here is the "ddB*design+s(soa,by=g)" versus "g+s(soa,by=g)" fits and
> output, showing that they're not identical:
>
>>fit1 = gamm4(
>>       data = ab
>>       , formula = rt ~ ddB*design+s(soa,by=g)
>>       , random = ~ (1|id)
>>)
>>
>> print(fit1$mer,corr=F)
> Linear mixed model fit by REML
>     AIC     BIC logLik deviance REMLdev
>  -564383 -564261 282205  -564567 -564411
> Random effects:
>  Groups   Name              Variance   Std.Dev.
>  id       (Intercept)       6.9251e-08 0.00026316
>  Xr.4     s(soa):g0ddB NCD 6.1200e-05 0.00782305
>  Xr.3     s(soa):g0ddB CD  1.3823e-03 0.03717992
>  Xr.2     s(soa):g+ddB NCD 2.4589e-04 0.01568096
>  Xr.1     s(soa):g+ddB CD  2.0301e-03 0.04505718
>  Residual                   2.0814e-07 0.00045623
> Number of obs: 45013, groups: id, 20; Xr.4, 8; Xr.3, 8; Xr.2, 8; Xr.1, 8
>
> Fixed effects:
>                        Estimate Std. Error t value
> X(Intercept)           2.455e-03  5.896e-05   41.63
> XddB+ddB               5.104e-05  5.232e-06    9.76
> XdesignNCD            -2.665e-04  7.066e-06  -37.72
> XddB+ddB:designNCD    -4.450e-05  1.007e-05   -4.42
> Xs(soa):g+ddB CDFx1  -1.592e-04  5.178e-05   -3.07
> Xs(soa):g+ddB NCDFx1  1.099e-04  6.717e-05    1.64
> Xs(soa):g0ddB CDFx1   9.027e-06  4.942e-05    0.18
> Xs(soa):g0ddB NCDFx1  3.618e-05  4.430e-05    0.82
>>
>>
>> summary(fit1$gam)
>
> Family: gaussian
> Link function: identity
>
> Formula:
> rrt ~ ddB * design + s(soa, by = g)
>
> Parametric coefficients:
>                    Estimate Std. Error t value Pr(>|t|)
> (Intercept)        2.455e-03  3.706e-06 662.327  < 2e-16 ***
> ddB+ddB            5.104e-05  5.230e-06   9.760  < 2e-16 ***
> designNCD         -2.665e-04  7.260e-06 -36.710  < 2e-16 ***
> ddB+ddB:designNCD -4.450e-05  1.005e-05  -4.428 9.55e-06 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Approximate significance of smooth terms:
>                    edf Ref.df       F  p-value
> s(soa):g+ddB CD  8.042  8.042 205.327  < 2e-16 ***
> s(soa):g+ddB NCD 5.419  5.419   4.375 0.000361 ***
> s(soa):g0ddB CD  7.795  7.795 213.477  < 2e-16 ***
> s(soa):g0ddB NCD 3.977  3.977   4.461 0.001358 **
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> R-sq.(adj) =  0.0964lmer.REML score = -5.6441e+05  Scale est. =
> 2.0814e-07  n = 45013
>
>>fit2 = gamm4(
>>       data = ab
>>       , formula = rt ~ g+s(soa,by=g)
>>       , random = ~ (1|id)
>>)
>>
>> print(fit2$mer,corr=F)
> Linear mixed model fit by REML
>     AIC     BIC logLik deviance REMLdev
>  -564381 -564259 282205  -564567 -564409
> Random effects:
>  Groups   Name              Variance   Std.Dev.
>  id       (Intercept)       6.9252e-08 0.00026316
>  Xr.4     s(Ssoa):g0ddB NCD 6.1196e-05 0.00782278
>  Xr.3     s(Ssoa):g0ddB CD  1.3824e-03 0.03718002
>  Xr.2     s(Ssoa):g+ddB NCD 2.4589e-04 0.01568093
>  Xr.1     s(Ssoa):g+ddB CD  2.0302e-03 0.04505719
>  Residual                   2.0814e-07 0.00045623
> Number of obs: 45013, groups: id, 20; Xr.4, 8; Xr.3, 8; Xr.2, 8; Xr.1, 8
>
> Fixed effects:
>                        Estimate Std. Error t value
> X(Intercept)           2.336e-03  5.890e-05   39.66
> Xg.34-12              -2.879e-05  5.035e-06   -5.72
> Xg.24-13              -2.887e-04  5.032e-06  -57.38
> Xg.14-23               2.225e-05  5.035e-06    4.42
> Xs(soa):g+ddB CDFx1  -1.592e-04  5.178e-05   -3.07
> Xs(soa):g+ddB NCDFx1  1.099e-04  6.717e-05    1.64
> Xs(soa):g0ddB CDFx1   9.027e-06  4.942e-05    0.18
> Xs(soa):g0ddB NCDFx1  3.618e-05  4.430e-05    0.82
>>
>>
>> summary(fit2$gam)
>
> Family: gaussian
> Link function: identity
>
> Formula:
> rrt ~ g + s(soa, by = g)
>
> Parametric coefficients:
>              Estimate Std. Error t value Pr(>|t|)
> (Intercept)  2.336e-03  2.637e-06 885.684  < 2e-16 ***
> g.34-12     -2.879e-05  5.025e-06  -5.728 1.02e-08 ***
> g.24-13     -2.887e-04  5.275e-06 -54.741  < 2e-16 ***
> g.14-23      2.225e-05  5.025e-06   4.428 9.55e-06 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Approximate significance of smooth terms:
>                    edf Ref.df       F  p-value
> s(soa):g+ddB CD  8.042  8.042 205.328  < 2e-16 ***
> s(soa):g+ddB NCD 5.419  5.419   4.375 0.000361 ***
> s(soa):g0ddB CD  7.795  7.795 213.477  < 2e-16 ***
> s(soa):g0ddB NCD 3.977  3.977   4.462 0.001357 **
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> R-sq.(adj) =  0.0964lmer.REML score = -5.6441e+05  Scale est. =
> 2.0814e-07  n = 45013
>
>
>
> On Sat, Jul 31, 2010 at 6:47 PM, Reinhold Kliegl
> <reinhold.kliegl at gmail.com> wrote:
>> Sorry, the following line should be:
>> constrasts(ab$g) <- cmat
>>
>> On Sat, Jul 31, 2010 at 11:30 PM, Reinhold Kliegl
>> <reinhold.kliegl at gmail.com> wrote:
>>> Perhaps assign contrasts to g. For example,
>>> # ... ... contrast estimates
>>> cmat <- matrix(c(  -1/2, -1/2, +1/2, +1/2,   # Main effect 1
>>>                              -1/2, +1/2, -1/2, +1/2,   # Main effect 2
>>>                              +1/2, -1/2, -1/2, +1/2),  4,  3,    # Interaction
>>>                               dimnames=list(c("A1", "A2", "A3", "A4"),
>>>                                c(".34-12", ".24-13", ".14-23")))
>>>
>>> constrasts(g) <- cmat
>>>  fit = gamm4(
>>>      data = ab
>>>      , formula = rrt ~ g+s(soa,by=g)
>>>     , random = ~ (1|id)
>>>     , family = 'gaussian'
>>>   )
>>>
>>> Reinhold Kliegl
>>>
>>> On Sat, Jul 31, 2010 at 6:10 AM, Mike Lawrence <Mike.Lawrence at dal.ca> wrote:
>>>> Hi folks,
>>>>
>>>> I have some data that form a 2x2x15 design, where the 15 level
>>>> variable is a discrete sampling of a ratio variable with clear
>>>> non-linearities (see bottom for a dput() of the means). I came across
>>>> gamm4 tonight and it looks like it will help tackle this data, but I'm
>>>> not sure how to tell it to let the smooth of the 15 level variable
>>>> vary as a function of BOTH of the other predictor variables. As a
>>>> hack, I created a dummy 4 level variable that represented the
>>>> combination of the 2x2 level variables, but I'm not positive that this
>>>> was the right thing to do. Any feedback would be greatly appreciated.
>>>> Here's how I had things set up:
>>>>
>>>>> str(ab)
>>>> 'data.frame':   49668 obs. of  5 variables:
>>>>  $ id       : Factor w/ 20 levels "5","6","7","8",..: 1 1 1 1 1 1 1 1 1 1 ...
>>>>  $ design   : Factor w/ 2 levels "CD","NCD": 1 1 1 1 1 1 1 1 1 1 ...
>>>>  $ ddB      : Factor w/ 2 levels "0ddB","+ddB": 1 1 1 1 1 1 1 1 1 1 ...
>>>>  $ soa      : num  300 300 300 300 300 300 300 300 300 300 ...
>>>>  $ rt       : num  441 373 440 290 221 ...
>>>>  $ g        : Factor w/ 4 levels "+ddB CD","+ddB NCD",..: 3 3 3 3 3 3
>>>> 3 3 3 3 ...
>>>>
>>>> #id is the random effect (human participants)
>>>> #design and ddB are 2-level fixed effects
>>>> #soa is a 15 level fixed effect
>>>> #rt is the data to predict (there are dozens of observations for each
>>>> combination of the random and fixed effects)
>>>>
>>>> #fit using dummy variable "g" to get different soa smooths per
>>>> combination of ddB and design
>>>> fit = gamm4(
>>>>        data = ab
>>>>        , formula = rrt ~ ddB*design+s(soa,by=g)
>>>>        , random = ~ (1|id)
>>>>        , family = 'gaussian'
>>>> )
>>>>
>>>>
>>>> #here's dput() ouput of the 2x2x15 means, revealing that soa is
>>>> clearly non-linear
>>>>> dput(means)
>>>> structure(list(ddB = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L,
>>>> 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
>>>> 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
>>>> 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
>>>> 2L, 1L, 2L, 1L, 2L), .Label = c("0ddB", "+ddB"), class = "factor"),
>>>>    design = structure(c(1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L,
>>>>    1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L,
>>>>    1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L,
>>>>    2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L,
>>>>    2L, 2L, 1L, 1L, 2L, 2L), .Label = c("CD", "NCD"), class = "factor"),
>>>>    soa = c(-175, -175, -175, -175, -125, -125, -125, -125, -75,
>>>>    -75, -75, -75, -25, -25, -25, -25, 25, 25, 25, 25, 75, 75,
>>>>    75, 75, 125, 125, 125, 125, 175, 175, 175, 175, 250, 250,
>>>>    250, 250, 350, 350, 350, 350, 450, 450, 450, 450, 550, 550,
>>>>    550, 550, 700, 700, 700, 700, 900, 900, 900, 900, 1100, 1100,
>>>>    1100, 1100), rt = structure(c(444.93273739708, 441.513208123373,
>>>>    471.546335977687, 472.283526755609, 444.056771928409, 442.461563636396,
>>>>    472.166623352385, 474.462330421383, 445.513188139913, 444.966221088285,
>>>>    475.669898472348, 461.315736508479, 442.982086750377, 435.355068015326,
>>>>    458.89264807265, 455.638816561315, 434.340063293089, 415.709247937572,
>>>>    450.359216604288, 438.114012378908, 420.982354512772, 404.811090521404,
>>>>    442.721222728774, 430.421981079446, 406.373845346035, 393.411137886923,
>>>>    439.687373941982, 427.611295261772, 398.185439780545, 384.007719475387,
>>>>    429.451304040456, 431.531830923294, 390.486982286177, 380.880066145206,
>>>>    431.287499227013, 435.926925139256, 382.937376664574, 382.020452210116,
>>>>    439.606939778423, 441.751714989440, 384.546631449636, 385.477927260379,
>>>>    440.7286749068, 442.790235121405, 387.589278670798, 386.68589658312,
>>>>    440.384534575679, 440.376618739428, 392.649929780933, 392.496542853778,
>>>>    442.673509587221, 446.148838198613, 401.042352162140, 394.475542684099,
>>>>    441.072702415870, 440.157897723193, 406.059905100442, 398.292572833949,
>>>>    443.899793077701, 445.715779415161), .Dim = c(60L, 1L), .Dimnames = list(
>>>>        NULL, NULL))), .Names = c("ddB", "design", "soa", "rt"
>>>> ), row.names = c(NA, 60L), class = "data.frame")
>>>>
>>>> #visualise the means
>>>> library(ggplot2)
>>>> ggplot(
>>>>        data = means
>>>>        , mapping = aes(
>>>>                x = soa
>>>>                , y = rt
>>>>                , colour = design
>>>>                , linetype = ddB
>>>>        )
>>>> )+
>>>> geom_line()+
>>>> geom_point(shape=21,fill='white')
>>>>
>>>> --
>>>> Mike Lawrence
>>>> Graduate Student
>>>> Department of Psychology
>>>> Dalhousie University
>>>>
>>>> Looking to arrange a meeting? Check my public calendar:
>>>> http://tr.im/mikes_public_calendar
>>>>
>>>> ~ Certainty is folly... I think. ~
>>>>
>>>> _______________________________________________
>>>> R-sig-mixed-models at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>>
>>>
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>
>
>
> --
> Mike Lawrence
> Graduate Student
> Department of Psychology
> Dalhousie University
>
> Looking to arrange a meeting? Check my public calendar:
> http://tr.im/mikes_public_calendar
>
> ~ Certainty is folly... I think. ~
>