[R-sig-ME] gamm4 model formulation clarification
Mike Lawrence
Mike.Lawrence at dal.ca
Sun Aug 1 02:42:14 CEST 2010
Thanks, though I'm not clear on how assigning contrasts to g then
adding it as a fixed effect is different from the original
specification, where the smooth is told to vary by g but g isn't in
the fixed effects; instead I had ddB*design in the fixed effects,
which I thought would be equivalent to having g and the contrasts you
suggested. However, it seems that I get different results between the
two approaches (see below). Any ideas why? I presume I'm missing
something!
Also, I'm just now thinking that neither approach really gets at what
I want to test; what I want to test whether the model is improved by
(1) letting the smooth vary as a function of design, (2) letting the
smooth vary as a function of ddB, and (3) letting the smooth vary as a
function of both design & ddB. That is, is there a design*soa
interaction, a ddB*soa interaction and/or a design*ddB*soa
interaction? In the case of support for any of these interactions, I'd
also be interested in pinpointing the timeframe over which the
interactions take place.
Thoughts?
#here is the "ddB*design+s(soa,by=g)" versus "g+s(soa,by=g)" fits and
output, showing that they're not identical:
>fit1 = gamm4(
> data = ab
> , formula = rt ~ ddB*design+s(soa,by=g)
> , random = ~ (1|id)
>)
>
> print(fit1$mer,corr=F)
Linear mixed model fit by REML
AIC BIC logLik deviance REMLdev
-564383 -564261 282205 -564567 -564411
Random effects:
Groups Name Variance Std.Dev.
id (Intercept) 6.9251e-08 0.00026316
Xr.4 s(soa):g0ddB NCD 6.1200e-05 0.00782305
Xr.3 s(soa):g0ddB CD 1.3823e-03 0.03717992
Xr.2 s(soa):g+ddB NCD 2.4589e-04 0.01568096
Xr.1 s(soa):g+ddB CD 2.0301e-03 0.04505718
Residual 2.0814e-07 0.00045623
Number of obs: 45013, groups: id, 20; Xr.4, 8; Xr.3, 8; Xr.2, 8; Xr.1, 8
Fixed effects:
Estimate Std. Error t value
X(Intercept) 2.455e-03 5.896e-05 41.63
XddB+ddB 5.104e-05 5.232e-06 9.76
XdesignNCD -2.665e-04 7.066e-06 -37.72
XddB+ddB:designNCD -4.450e-05 1.007e-05 -4.42
Xs(soa):g+ddB CDFx1 -1.592e-04 5.178e-05 -3.07
Xs(soa):g+ddB NCDFx1 1.099e-04 6.717e-05 1.64
Xs(soa):g0ddB CDFx1 9.027e-06 4.942e-05 0.18
Xs(soa):g0ddB NCDFx1 3.618e-05 4.430e-05 0.82
>
>
> summary(fit1$gam)
Family: gaussian
Link function: identity
Formula:
rrt ~ ddB * design + s(soa, by = g)
Parametric coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.455e-03 3.706e-06 662.327 < 2e-16 ***
ddB+ddB 5.104e-05 5.230e-06 9.760 < 2e-16 ***
designNCD -2.665e-04 7.260e-06 -36.710 < 2e-16 ***
ddB+ddB:designNCD -4.450e-05 1.005e-05 -4.428 9.55e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Approximate significance of smooth terms:
edf Ref.df F p-value
s(soa):g+ddB CD 8.042 8.042 205.327 < 2e-16 ***
s(soa):g+ddB NCD 5.419 5.419 4.375 0.000361 ***
s(soa):g0ddB CD 7.795 7.795 213.477 < 2e-16 ***
s(soa):g0ddB NCD 3.977 3.977 4.461 0.001358 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
R-sq.(adj) = 0.0964lmer.REML score = -5.6441e+05 Scale est. =
2.0814e-07 n = 45013
>fit2 = gamm4(
> data = ab
> , formula = rt ~ g+s(soa,by=g)
> , random = ~ (1|id)
>)
>
> print(fit2$mer,corr=F)
Linear mixed model fit by REML
AIC BIC logLik deviance REMLdev
-564381 -564259 282205 -564567 -564409
Random effects:
Groups Name Variance Std.Dev.
id (Intercept) 6.9252e-08 0.00026316
Xr.4 s(Ssoa):g0ddB NCD 6.1196e-05 0.00782278
Xr.3 s(Ssoa):g0ddB CD 1.3824e-03 0.03718002
Xr.2 s(Ssoa):g+ddB NCD 2.4589e-04 0.01568093
Xr.1 s(Ssoa):g+ddB CD 2.0302e-03 0.04505719
Residual 2.0814e-07 0.00045623
Number of obs: 45013, groups: id, 20; Xr.4, 8; Xr.3, 8; Xr.2, 8; Xr.1, 8
Fixed effects:
Estimate Std. Error t value
X(Intercept) 2.336e-03 5.890e-05 39.66
Xg.34-12 -2.879e-05 5.035e-06 -5.72
Xg.24-13 -2.887e-04 5.032e-06 -57.38
Xg.14-23 2.225e-05 5.035e-06 4.42
Xs(soa):g+ddB CDFx1 -1.592e-04 5.178e-05 -3.07
Xs(soa):g+ddB NCDFx1 1.099e-04 6.717e-05 1.64
Xs(soa):g0ddB CDFx1 9.027e-06 4.942e-05 0.18
Xs(soa):g0ddB NCDFx1 3.618e-05 4.430e-05 0.82
>
>
> summary(fit2$gam)
Family: gaussian
Link function: identity
Formula:
rrt ~ g + s(soa, by = g)
Parametric coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.336e-03 2.637e-06 885.684 < 2e-16 ***
g.34-12 -2.879e-05 5.025e-06 -5.728 1.02e-08 ***
g.24-13 -2.887e-04 5.275e-06 -54.741 < 2e-16 ***
g.14-23 2.225e-05 5.025e-06 4.428 9.55e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Approximate significance of smooth terms:
edf Ref.df F p-value
s(soa):g+ddB CD 8.042 8.042 205.328 < 2e-16 ***
s(soa):g+ddB NCD 5.419 5.419 4.375 0.000361 ***
s(soa):g0ddB CD 7.795 7.795 213.477 < 2e-16 ***
s(soa):g0ddB NCD 3.977 3.977 4.462 0.001357 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
R-sq.(adj) = 0.0964lmer.REML score = -5.6441e+05 Scale est. =
2.0814e-07 n = 45013
On Sat, Jul 31, 2010 at 6:47 PM, Reinhold Kliegl
<reinhold.kliegl at gmail.com> wrote:
> Sorry, the following line should be:
> constrasts(ab$g) <- cmat
>
> On Sat, Jul 31, 2010 at 11:30 PM, Reinhold Kliegl
> <reinhold.kliegl at gmail.com> wrote:
>> Perhaps assign contrasts to g. For example,
>> # ... ... contrast estimates
>> cmat <- matrix(c( -1/2, -1/2, +1/2, +1/2, # Main effect 1
>> -1/2, +1/2, -1/2, +1/2, # Main effect 2
>> +1/2, -1/2, -1/2, +1/2), 4, 3, # Interaction
>> dimnames=list(c("A1", "A2", "A3", "A4"),
>> c(".34-12", ".24-13", ".14-23")))
>>
>> constrasts(g) <- cmat
>> fit = gamm4(
>> data = ab
>> , formula = rrt ~ g+s(soa,by=g)
>> , random = ~ (1|id)
>> , family = 'gaussian'
>> )
>>
>> Reinhold Kliegl
>>
>> On Sat, Jul 31, 2010 at 6:10 AM, Mike Lawrence <Mike.Lawrence at dal.ca> wrote:
>>> Hi folks,
>>>
>>> I have some data that form a 2x2x15 design, where the 15 level
>>> variable is a discrete sampling of a ratio variable with clear
>>> non-linearities (see bottom for a dput() of the means). I came across
>>> gamm4 tonight and it looks like it will help tackle this data, but I'm
>>> not sure how to tell it to let the smooth of the 15 level variable
>>> vary as a function of BOTH of the other predictor variables. As a
>>> hack, I created a dummy 4 level variable that represented the
>>> combination of the 2x2 level variables, but I'm not positive that this
>>> was the right thing to do. Any feedback would be greatly appreciated.
>>> Here's how I had things set up:
>>>
>>>> str(ab)
>>> 'data.frame': 49668 obs. of 5 variables:
>>> $ id : Factor w/ 20 levels "5","6","7","8",..: 1 1 1 1 1 1 1 1 1 1 ...
>>> $ design : Factor w/ 2 levels "CD","NCD": 1 1 1 1 1 1 1 1 1 1 ...
>>> $ ddB : Factor w/ 2 levels "0ddB","+ddB": 1 1 1 1 1 1 1 1 1 1 ...
>>> $ soa : num 300 300 300 300 300 300 300 300 300 300 ...
>>> $ rt : num 441 373 440 290 221 ...
>>> $ g : Factor w/ 4 levels "+ddB CD","+ddB NCD",..: 3 3 3 3 3 3
>>> 3 3 3 3 ...
>>>
>>> #id is the random effect (human participants)
>>> #design and ddB are 2-level fixed effects
>>> #soa is a 15 level fixed effect
>>> #rt is the data to predict (there are dozens of observations for each
>>> combination of the random and fixed effects)
>>>
>>> #fit using dummy variable "g" to get different soa smooths per
>>> combination of ddB and design
>>> fit = gamm4(
>>> data = ab
>>> , formula = rrt ~ ddB*design+s(soa,by=g)
>>> , random = ~ (1|id)
>>> , family = 'gaussian'
>>> )
>>>
>>>
>>> #here's dput() ouput of the 2x2x15 means, revealing that soa is
>>> clearly non-linear
>>>> dput(means)
>>> structure(list(ddB = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L,
>>> 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
>>> 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
>>> 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
>>> 2L, 1L, 2L, 1L, 2L), .Label = c("0ddB", "+ddB"), class = "factor"),
>>> design = structure(c(1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L,
>>> 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L,
>>> 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L,
>>> 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L,
>>> 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("CD", "NCD"), class = "factor"),
>>> soa = c(-175, -175, -175, -175, -125, -125, -125, -125, -75,
>>> -75, -75, -75, -25, -25, -25, -25, 25, 25, 25, 25, 75, 75,
>>> 75, 75, 125, 125, 125, 125, 175, 175, 175, 175, 250, 250,
>>> 250, 250, 350, 350, 350, 350, 450, 450, 450, 450, 550, 550,
>>> 550, 550, 700, 700, 700, 700, 900, 900, 900, 900, 1100, 1100,
>>> 1100, 1100), rt = structure(c(444.93273739708, 441.513208123373,
>>> 471.546335977687, 472.283526755609, 444.056771928409, 442.461563636396,
>>> 472.166623352385, 474.462330421383, 445.513188139913, 444.966221088285,
>>> 475.669898472348, 461.315736508479, 442.982086750377, 435.355068015326,
>>> 458.89264807265, 455.638816561315, 434.340063293089, 415.709247937572,
>>> 450.359216604288, 438.114012378908, 420.982354512772, 404.811090521404,
>>> 442.721222728774, 430.421981079446, 406.373845346035, 393.411137886923,
>>> 439.687373941982, 427.611295261772, 398.185439780545, 384.007719475387,
>>> 429.451304040456, 431.531830923294, 390.486982286177, 380.880066145206,
>>> 431.287499227013, 435.926925139256, 382.937376664574, 382.020452210116,
>>> 439.606939778423, 441.751714989440, 384.546631449636, 385.477927260379,
>>> 440.7286749068, 442.790235121405, 387.589278670798, 386.68589658312,
>>> 440.384534575679, 440.376618739428, 392.649929780933, 392.496542853778,
>>> 442.673509587221, 446.148838198613, 401.042352162140, 394.475542684099,
>>> 441.072702415870, 440.157897723193, 406.059905100442, 398.292572833949,
>>> 443.899793077701, 445.715779415161), .Dim = c(60L, 1L), .Dimnames = list(
>>> NULL, NULL))), .Names = c("ddB", "design", "soa", "rt"
>>> ), row.names = c(NA, 60L), class = "data.frame")
>>>
>>> #visualise the means
>>> library(ggplot2)
>>> ggplot(
>>> data = means
>>> , mapping = aes(
>>> x = soa
>>> , y = rt
>>> , colour = design
>>> , linetype = ddB
>>> )
>>> )+
>>> geom_line()+
>>> geom_point(shape=21,fill='white')
>>>
>>> --
>>> Mike Lawrence
>>> Graduate Student
>>> Department of Psychology
>>> Dalhousie University
>>>
>>> Looking to arrange a meeting? Check my public calendar:
>>> http://tr.im/mikes_public_calendar
>>>
>>> ~ Certainty is folly... I think. ~
>>>
>>> _______________________________________________
>>> R-sig-mixed-models at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>
>>
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
--
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University
Looking to arrange a meeting? Check my public calendar:
http://tr.im/mikes_public_calendar
~ Certainty is folly... I think. ~
More information about the R-sig-mixed-models
mailing list