[R-sig-ME] gamm4 model formulation clarification
Reinhold Kliegl
reinhold.kliegl at gmail.com
Sun Aug 1 08:48:27 CEST 2010
On the difference between the two specifications: Perhaps you have
treatment contrasts for design and soa. Then the fixed effect
coefficients will not estimate the main effect and the interaction.
You would need to specify "contrasts(design) <- contr.sum(2)/2, etc."
for them if you specify the contrast matrix as a I suggested (which
returns estimates of the difference between levels, not as a
difference from the grand mean as the default).
Reinhold Kliegl
On Sun, Aug 1, 2010 at 2:42 AM, Mike Lawrence <Mike.Lawrence at dal.ca> wrote:
> Thanks, though I'm not clear on how assigning contrasts to g then
> adding it as a fixed effect is different from the original
> specification, where the smooth is told to vary by g but g isn't in
> the fixed effects; instead I had ddB*design in the fixed effects,
> which I thought would be equivalent to having g and the contrasts you
> suggested. However, it seems that I get different results between the
> two approaches (see below). Any ideas why? I presume I'm missing
> something!
>
> Also, I'm just now thinking that neither approach really gets at what
> I want to test; what I want to test whether the model is improved by
> (1) letting the smooth vary as a function of design, (2) letting the
> smooth vary as a function of ddB, and (3) letting the smooth vary as a
> function of both design & ddB. That is, is there a design*soa
> interaction, a ddB*soa interaction and/or a design*ddB*soa
> interaction? In the case of support for any of these interactions, I'd
> also be interested in pinpointing the timeframe over which the
> interactions take place.
>
> Thoughts?
>
> #here is the "ddB*design+s(soa,by=g)" versus "g+s(soa,by=g)" fits and
> output, showing that they're not identical:
>
>>fit1 = gamm4(
>> data = ab
>> , formula = rt ~ ddB*design+s(soa,by=g)
>> , random = ~ (1|id)
>>)
>>
>> print(fit1$mer,corr=F)
> Linear mixed model fit by REML
> AIC BIC logLik deviance REMLdev
> -564383 -564261 282205 -564567 -564411
> Random effects:
> Groups Name Variance Std.Dev.
> id (Intercept) 6.9251e-08 0.00026316
> Xr.4 s(soa):g0ddB NCD 6.1200e-05 0.00782305
> Xr.3 s(soa):g0ddB CD 1.3823e-03 0.03717992
> Xr.2 s(soa):g+ddB NCD 2.4589e-04 0.01568096
> Xr.1 s(soa):g+ddB CD 2.0301e-03 0.04505718
> Residual 2.0814e-07 0.00045623
> Number of obs: 45013, groups: id, 20; Xr.4, 8; Xr.3, 8; Xr.2, 8; Xr.1, 8
>
> Fixed effects:
> Estimate Std. Error t value
> X(Intercept) 2.455e-03 5.896e-05 41.63
> XddB+ddB 5.104e-05 5.232e-06 9.76
> XdesignNCD -2.665e-04 7.066e-06 -37.72
> XddB+ddB:designNCD -4.450e-05 1.007e-05 -4.42
> Xs(soa):g+ddB CDFx1 -1.592e-04 5.178e-05 -3.07
> Xs(soa):g+ddB NCDFx1 1.099e-04 6.717e-05 1.64
> Xs(soa):g0ddB CDFx1 9.027e-06 4.942e-05 0.18
> Xs(soa):g0ddB NCDFx1 3.618e-05 4.430e-05 0.82
>>
>>
>> summary(fit1$gam)
>
> Family: gaussian
> Link function: identity
>
> Formula:
> rrt ~ ddB * design + s(soa, by = g)
>
> Parametric coefficients:
> Estimate Std. Error t value Pr(>|t|)
> (Intercept) 2.455e-03 3.706e-06 662.327 < 2e-16 ***
> ddB+ddB 5.104e-05 5.230e-06 9.760 < 2e-16 ***
> designNCD -2.665e-04 7.260e-06 -36.710 < 2e-16 ***
> ddB+ddB:designNCD -4.450e-05 1.005e-05 -4.428 9.55e-06 ***
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Approximate significance of smooth terms:
> edf Ref.df F p-value
> s(soa):g+ddB CD 8.042 8.042 205.327 < 2e-16 ***
> s(soa):g+ddB NCD 5.419 5.419 4.375 0.000361 ***
> s(soa):g0ddB CD 7.795 7.795 213.477 < 2e-16 ***
> s(soa):g0ddB NCD 3.977 3.977 4.461 0.001358 **
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> R-sq.(adj) = 0.0964lmer.REML score = -5.6441e+05 Scale est. =
> 2.0814e-07 n = 45013
>
>>fit2 = gamm4(
>> data = ab
>> , formula = rt ~ g+s(soa,by=g)
>> , random = ~ (1|id)
>>)
>>
>> print(fit2$mer,corr=F)
> Linear mixed model fit by REML
> AIC BIC logLik deviance REMLdev
> -564381 -564259 282205 -564567 -564409
> Random effects:
> Groups Name Variance Std.Dev.
> id (Intercept) 6.9252e-08 0.00026316
> Xr.4 s(Ssoa):g0ddB NCD 6.1196e-05 0.00782278
> Xr.3 s(Ssoa):g0ddB CD 1.3824e-03 0.03718002
> Xr.2 s(Ssoa):g+ddB NCD 2.4589e-04 0.01568093
> Xr.1 s(Ssoa):g+ddB CD 2.0302e-03 0.04505719
> Residual 2.0814e-07 0.00045623
> Number of obs: 45013, groups: id, 20; Xr.4, 8; Xr.3, 8; Xr.2, 8; Xr.1, 8
>
> Fixed effects:
> Estimate Std. Error t value
> X(Intercept) 2.336e-03 5.890e-05 39.66
> Xg.34-12 -2.879e-05 5.035e-06 -5.72
> Xg.24-13 -2.887e-04 5.032e-06 -57.38
> Xg.14-23 2.225e-05 5.035e-06 4.42
> Xs(soa):g+ddB CDFx1 -1.592e-04 5.178e-05 -3.07
> Xs(soa):g+ddB NCDFx1 1.099e-04 6.717e-05 1.64
> Xs(soa):g0ddB CDFx1 9.027e-06 4.942e-05 0.18
> Xs(soa):g0ddB NCDFx1 3.618e-05 4.430e-05 0.82
>>
>>
>> summary(fit2$gam)
>
> Family: gaussian
> Link function: identity
>
> Formula:
> rrt ~ g + s(soa, by = g)
>
> Parametric coefficients:
> Estimate Std. Error t value Pr(>|t|)
> (Intercept) 2.336e-03 2.637e-06 885.684 < 2e-16 ***
> g.34-12 -2.879e-05 5.025e-06 -5.728 1.02e-08 ***
> g.24-13 -2.887e-04 5.275e-06 -54.741 < 2e-16 ***
> g.14-23 2.225e-05 5.025e-06 4.428 9.55e-06 ***
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Approximate significance of smooth terms:
> edf Ref.df F p-value
> s(soa):g+ddB CD 8.042 8.042 205.328 < 2e-16 ***
> s(soa):g+ddB NCD 5.419 5.419 4.375 0.000361 ***
> s(soa):g0ddB CD 7.795 7.795 213.477 < 2e-16 ***
> s(soa):g0ddB NCD 3.977 3.977 4.462 0.001357 **
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> R-sq.(adj) = 0.0964lmer.REML score = -5.6441e+05 Scale est. =
> 2.0814e-07 n = 45013
>
>
>
> On Sat, Jul 31, 2010 at 6:47 PM, Reinhold Kliegl
> <reinhold.kliegl at gmail.com> wrote:
>> Sorry, the following line should be:
>> constrasts(ab$g) <- cmat
>>
>> On Sat, Jul 31, 2010 at 11:30 PM, Reinhold Kliegl
>> <reinhold.kliegl at gmail.com> wrote:
>>> Perhaps assign contrasts to g. For example,
>>> # ... ... contrast estimates
>>> cmat <- matrix(c( -1/2, -1/2, +1/2, +1/2, # Main effect 1
>>> -1/2, +1/2, -1/2, +1/2, # Main effect 2
>>> +1/2, -1/2, -1/2, +1/2), 4, 3, # Interaction
>>> dimnames=list(c("A1", "A2", "A3", "A4"),
>>> c(".34-12", ".24-13", ".14-23")))
>>>
>>> constrasts(g) <- cmat
>>> fit = gamm4(
>>> data = ab
>>> , formula = rrt ~ g+s(soa,by=g)
>>> , random = ~ (1|id)
>>> , family = 'gaussian'
>>> )
>>>
>>> Reinhold Kliegl
>>>
>>> On Sat, Jul 31, 2010 at 6:10 AM, Mike Lawrence <Mike.Lawrence at dal.ca> wrote:
>>>> Hi folks,
>>>>
>>>> I have some data that form a 2x2x15 design, where the 15 level
>>>> variable is a discrete sampling of a ratio variable with clear
>>>> non-linearities (see bottom for a dput() of the means). I came across
>>>> gamm4 tonight and it looks like it will help tackle this data, but I'm
>>>> not sure how to tell it to let the smooth of the 15 level variable
>>>> vary as a function of BOTH of the other predictor variables. As a
>>>> hack, I created a dummy 4 level variable that represented the
>>>> combination of the 2x2 level variables, but I'm not positive that this
>>>> was the right thing to do. Any feedback would be greatly appreciated.
>>>> Here's how I had things set up:
>>>>
>>>>> str(ab)
>>>> 'data.frame': 49668 obs. of 5 variables:
>>>> $ id : Factor w/ 20 levels "5","6","7","8",..: 1 1 1 1 1 1 1 1 1 1 ...
>>>> $ design : Factor w/ 2 levels "CD","NCD": 1 1 1 1 1 1 1 1 1 1 ...
>>>> $ ddB : Factor w/ 2 levels "0ddB","+ddB": 1 1 1 1 1 1 1 1 1 1 ...
>>>> $ soa : num 300 300 300 300 300 300 300 300 300 300 ...
>>>> $ rt : num 441 373 440 290 221 ...
>>>> $ g : Factor w/ 4 levels "+ddB CD","+ddB NCD",..: 3 3 3 3 3 3
>>>> 3 3 3 3 ...
>>>>
>>>> #id is the random effect (human participants)
>>>> #design and ddB are 2-level fixed effects
>>>> #soa is a 15 level fixed effect
>>>> #rt is the data to predict (there are dozens of observations for each
>>>> combination of the random and fixed effects)
>>>>
>>>> #fit using dummy variable "g" to get different soa smooths per
>>>> combination of ddB and design
>>>> fit = gamm4(
>>>> data = ab
>>>> , formula = rrt ~ ddB*design+s(soa,by=g)
>>>> , random = ~ (1|id)
>>>> , family = 'gaussian'
>>>> )
>>>>
>>>>
>>>> #here's dput() ouput of the 2x2x15 means, revealing that soa is
>>>> clearly non-linear
>>>>> dput(means)
>>>> structure(list(ddB = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L,
>>>> 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
>>>> 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
>>>> 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
>>>> 2L, 1L, 2L, 1L, 2L), .Label = c("0ddB", "+ddB"), class = "factor"),
>>>> design = structure(c(1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L,
>>>> 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L,
>>>> 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L,
>>>> 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L,
>>>> 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("CD", "NCD"), class = "factor"),
>>>> soa = c(-175, -175, -175, -175, -125, -125, -125, -125, -75,
>>>> -75, -75, -75, -25, -25, -25, -25, 25, 25, 25, 25, 75, 75,
>>>> 75, 75, 125, 125, 125, 125, 175, 175, 175, 175, 250, 250,
>>>> 250, 250, 350, 350, 350, 350, 450, 450, 450, 450, 550, 550,
>>>> 550, 550, 700, 700, 700, 700, 900, 900, 900, 900, 1100, 1100,
>>>> 1100, 1100), rt = structure(c(444.93273739708, 441.513208123373,
>>>> 471.546335977687, 472.283526755609, 444.056771928409, 442.461563636396,
>>>> 472.166623352385, 474.462330421383, 445.513188139913, 444.966221088285,
>>>> 475.669898472348, 461.315736508479, 442.982086750377, 435.355068015326,
>>>> 458.89264807265, 455.638816561315, 434.340063293089, 415.709247937572,
>>>> 450.359216604288, 438.114012378908, 420.982354512772, 404.811090521404,
>>>> 442.721222728774, 430.421981079446, 406.373845346035, 393.411137886923,
>>>> 439.687373941982, 427.611295261772, 398.185439780545, 384.007719475387,
>>>> 429.451304040456, 431.531830923294, 390.486982286177, 380.880066145206,
>>>> 431.287499227013, 435.926925139256, 382.937376664574, 382.020452210116,
>>>> 439.606939778423, 441.751714989440, 384.546631449636, 385.477927260379,
>>>> 440.7286749068, 442.790235121405, 387.589278670798, 386.68589658312,
>>>> 440.384534575679, 440.376618739428, 392.649929780933, 392.496542853778,
>>>> 442.673509587221, 446.148838198613, 401.042352162140, 394.475542684099,
>>>> 441.072702415870, 440.157897723193, 406.059905100442, 398.292572833949,
>>>> 443.899793077701, 445.715779415161), .Dim = c(60L, 1L), .Dimnames = list(
>>>> NULL, NULL))), .Names = c("ddB", "design", "soa", "rt"
>>>> ), row.names = c(NA, 60L), class = "data.frame")
>>>>
>>>> #visualise the means
>>>> library(ggplot2)
>>>> ggplot(
>>>> data = means
>>>> , mapping = aes(
>>>> x = soa
>>>> , y = rt
>>>> , colour = design
>>>> , linetype = ddB
>>>> )
>>>> )+
>>>> geom_line()+
>>>> geom_point(shape=21,fill='white')
>>>>
>>>> --
>>>> Mike Lawrence
>>>> Graduate Student
>>>> Department of Psychology
>>>> Dalhousie University
>>>>
>>>> Looking to arrange a meeting? Check my public calendar:
>>>> http://tr.im/mikes_public_calendar
>>>>
>>>> ~ Certainty is folly... I think. ~
>>>>
>>>> _______________________________________________
>>>> R-sig-mixed-models at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>>
>>>
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>
>
>
> --
> Mike Lawrence
> Graduate Student
> Department of Psychology
> Dalhousie University
>
> Looking to arrange a meeting? Check my public calendar:
> http://tr.im/mikes_public_calendar
>
> ~ Certainty is folly... I think. ~
>
More information about the R-sig-mixed-models
mailing list