[R] mgcv::gam(): NA parametric coefficient in a model with two categorical variables + model interpretation
Simon Wood
simon.wood at bath.edu
Mon May 23 15:32:06 CEST 2016
Q1: It looks like the model is not fully identifiable given the data, and
as a result igcCAT.ideo has been set to zero. There is no sensible test
to conduct with such a term, hence the NAs in the test statistic and
p-value fields.
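
As a rough illustration of what non-identifiability looks like (a sketch on made-up toy data, not a diagnosis of the model above), one can compare the rank of a design matrix with its number of columns. The names `d`, `cond` and `shape` here are hypothetical; the two factors are deliberately confounded so that their dummy columns coincide. The summary quoted below reports "Rank: 652/655", which is the same kind of shortfall.

```r
# Toy data (hypothetical): 'cond' and 'shape' are perfectly confounded,
# so the dummy column for cond == "b" equals the one for shape == "y".
d <- data.frame(
  cond  = factor(rep(c("a", "b"), each = 4)),
  shape = factor(rep(c("x", "y"), each = 4))
)

# Parametric design matrix: intercept + one dummy per non-reference level.
X <- model.matrix(~ cond + shape, data = d)
n_coef <- ncol(X)      # number of coefficients requested
rk     <- qr(X)$rank   # number that can actually be estimated

# With a response attached, lm() marks the aliased coefficient as NA.
set.seed(1)
d$y <- rnorm(nrow(d))
fit <- lm(y ~ cond + shape, data = d)
print(c(columns = n_coef, rank = rk))
print(coef(fit))
```

When the rank is smaller than the number of columns, at least one coefficient cannot be estimated separately from the others; lm() reports it as NA, while the gam summary quoted below shows an estimate of 0 with NA test fields.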
Q2: A separate (centred) smooth is estimated for each level of igc. If
you want a baseline (igcCAT.pseudo) smooth, and difference smooths for
the rest of the levels of igc then you need to set igc to be an ordered
factor, and use something like...
~ igc + s(ctrial) + s(ctrial,by=igc)
- see section on `by' variables in ?gam.models.
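
A minimal sketch of that setup, on simulated data: the data frame `dat`, the simulated response and the use of treatment contrasts are assumptions made for illustration, and only the formula structure comes from the suggestion above. R uses polynomial contrasts for ordered factors by default, so treatment contrasts are set here to keep the parametric igc terms as differences from the first level.

```r
library(mgcv)

set.seed(1)
n <- 400
lev <- c("CAT.pseudo", "CAT.ideo", "PA.pseudo", "PA.ideo")

# Simulated stand-in data (hypothetical values, names borrowed from the thread).
dat <- data.frame(
  ctrial = runif(n),
  igc    = sample(lev, n, replace = TRUE)
)
dat$acc <- rbinom(n, 1, plogis(1 + sin(2 * pi * dat$ctrial)))

# Ordered factor with the intended baseline as its first level.
dat$igc <- factor(dat$igc, levels = lev, ordered = TRUE)
# Assumption: treatment contrasts, so parametric terms are level differences.
contrasts(dat$igc) <- "contr.treatment"

# s(ctrial) is the baseline smooth; s(ctrial, by = igc) adds one
# difference smooth for each level other than the first.
m <- gam(acc ~ igc + s(ctrial) + s(ctrial, by = igc),
         data = dat, family = binomial, method = "REML")

smooth_labels <- sapply(m$smooth, function(s) s$label)
print(smooth_labels)
```

With an ordered `by` factor no smooth is generated for the first level, so the fitted model should contain the baseline smooth plus three difference smooths rather than four separate per-level smooths.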
best,
Simon
On 22/05/16 23:29, Fotis Fotiadis wrote:
> Hello all
>
> I am using a gam model for my data.
>
> m2.4<-bam(acc~ 1 + igc + s(ctrial, by=igc) + shape + s(ctrial, by=shape) +
> s(ctrial, sbj, bs = "fs", m = 1) , data=data, family=binomial)
>
> igc codes condition and there are four levels (CAT.pseudo,
> CAT.ideo,PA.pseudo, PA.ideo), and shape is a factor (that cannot be
> considered random effect) with four levels too (rand21, rand22, rand23,
> rand30).
>
> Here is the summary of the model
>> summary(m2.4)
> Family: binomial
> Link function: logit
>
> Formula:
> acc ~ 1 + igc + s(ctrial, by = igc) + shape + s(ctrial, by = shape) +
> s(ctrial, sbj, bs = "fs", m = 1)
>
> Parametric coefficients:
>              Estimate Std. Error z value Pr(>|z|)
> (Intercept)    3.5321     0.1930  18.302  < 2e-16 ***
> igcCAT.ideo    0.0000     0.0000      NA       NA
> igcPA.ideo    -0.3650     0.2441  -1.495   0.1348
> igcPA.pseudo  -0.2708     0.2574  -1.052   0.2928
> shaperand22   -0.1390     0.1548  -0.898   0.3693
> shaperand23    0.3046     0.1670   1.823   0.0682 .
> shaperand30   -0.5839     0.1163  -5.020 5.16e-07 ***
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Approximate significance of smooth terms:
>                             edf  Ref.df   Chi.sq  p-value
> s(ctrial):igcCAT.pseudo   3.902   4.853   74.787 1.07e-14 ***
> s(ctrial):igcCAT.ideo     2.293   2.702   13.794 0.001750 **
> s(ctrial):igcPA.ideo      1.000   1.000   11.391 0.000738 ***
> s(ctrial):igcPA.pseudo    3.158   3.815   20.411 0.000413 ***
> s(ctrial):shaperand21     2.556   3.316   31.387 1.46e-06 ***
> s(ctrial):shaperand22     1.000   1.000    0.898 0.343381
> s(ctrial):shaperand23     2.304   2.850    6.144 0.118531
> s(ctrial):shaperand30     4.952   5.947   27.806 0.000144 ***
> s(ctrial,sbj)           221.476 574.000 1502.779  < 2e-16 ***
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Rank: 652/655
> R-sq.(adj) = 0.405 Deviance explained = 43.9%
> fREML = 24003 Scale est. = 1 n = 18417
>
>
> I am not sure how this model works, but I guess it creates four smooths,
> one for each level of condition, and four smooths, one for each level of shape.
>
> There is also the intercept of the model, set at the reference level of
> condition (CAT.pseudo) and at the reference level of shape (rand21). Each
> parametric term represents the difference of each level of each of the two
> factors from the intercept.
>
> I have two questions
>
> Q1:
> Does anyone know why I get NA results in the second line of the parametric
> terms?
>
> Q2:
> The term igcCAT.ideo denotes the difference in the intercept between
> (A): condition=CAT.ideo, and
> (B): (condition=CAT.pseudo) & (shape=rand21).
> But what is the value (level) of shape for (A)?
> Is it the reference level? Or is it, perhaps, the "grand mean" of the shape
> variable?
>
>
> Thank you in advance for your time,
> Fotis
>
>
--
Simon Wood, School of Mathematics, University of Bristol BS8 1TW UK
+44 (0)117 33 18273 http://www.maths.bris.ac.uk/~sw15190