[R] mgcv::gam(): NA parametric coefficient in a model with two categorical variables + model interpretation

Fotis Fotiadis fotisfotiadis at gmail.com
Mon May 23 00:29:33 CEST 2016


Hello all,

I am fitting a GAM to my data with mgcv::bam():

library(mgcv)

m2.4 <- bam(acc ~ 1 + igc + s(ctrial, by = igc) + shape + s(ctrial, by = shape) +
              s(ctrial, sbj, bs = "fs", m = 1),
            data = data, family = binomial)

igc codes condition and has four levels (CAT.pseudo, CAT.ideo, PA.pseudo,
PA.ideo). shape is a factor (which cannot be treated as a random effect),
also with four levels (rand21, rand22, rand23, rand30).

Here is the summary of the model
> summary(m2.4)

Family: binomial
Link function: logit

Formula:
acc ~ 1 + igc + s(ctrial, by = igc) + shape + s(ctrial, by = shape) +
    s(ctrial, sbj, bs = "fs", m = 1)

Parametric coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)    3.5321     0.1930  18.302  < 2e-16 ***
igcCAT.ideo    0.0000     0.0000      NA       NA
igcPA.ideo    -0.3650     0.2441  -1.495   0.1348
igcPA.pseudo  -0.2708     0.2574  -1.052   0.2928
shaperand22   -0.1390     0.1548  -0.898   0.3693
shaperand23    0.3046     0.1670   1.823   0.0682 .
shaperand30   -0.5839     0.1163  -5.020 5.16e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Approximate significance of smooth terms:
                            edf  Ref.df   Chi.sq  p-value
s(ctrial):igcCAT.pseudo   3.902   4.853   74.787 1.07e-14 ***
s(ctrial):igcCAT.ideo     2.293   2.702   13.794 0.001750 **
s(ctrial):igcPA.ideo      1.000   1.000   11.391 0.000738 ***
s(ctrial):igcPA.pseudo    3.158   3.815   20.411 0.000413 ***
s(ctrial):shaperand21     2.556   3.316   31.387 1.46e-06 ***
s(ctrial):shaperand22     1.000   1.000    0.898 0.343381
s(ctrial):shaperand23     2.304   2.850    6.144 0.118531
s(ctrial):shaperand30     4.952   5.947   27.806 0.000144 ***
s(ctrial,sbj)           221.476 574.000 1502.779  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Rank: 652/655
R-sq.(adj) =  0.405   Deviance explained = 43.9%
fREML =  24003  Scale est. = 1         n = 18417


I am not sure exactly how this model works, but as I understand it, it fits
one smooth of ctrial for each of the four levels of condition and one smooth
of ctrial for each of the four levels of shape, plus a random smooth for each
subject.
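
For illustration, a minimal sketch on simulated data of how mgcv expands
s(x, by = f) when f is a factor: it builds one smooth per level of f. The
names toy, x, f and y are hypothetical and are not taken from the data above.

```r
library(mgcv)

set.seed(1)
# Simulated toy data; all names here are hypothetical
n <- 400
toy <- data.frame(
  x = runif(n),
  f = factor(sample(c("a", "b", "c", "d"), n, replace = TRUE))
)
toy$y <- sin(2 * pi * toy$x) * (toy$f == "a") +
  0.5 * as.integer(toy$f) + rnorm(n, sd = 0.3)

# The factor enters twice: as a parametric term (level shifts) and as a
# 'by' variable (one smooth of x per level)
fit <- gam(y ~ f + s(x, by = f), data = toy, method = "REML")

# Collect the labels of the smooths that were actually constructed
smooth_labels <- sapply(fit$smooth, function(s) s$label)
print(smooth_labels)
```

The number of labels printed should equal the number of levels of f, which
matches the four s(ctrial):igc... and four s(ctrial):shape... rows in the
summary above.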

The model also has an intercept, which corresponds to the reference level of
condition (CAT.pseudo) combined with the reference level of shape (rand21).
Each parametric coefficient then represents the difference, on the logit
scale, between one non-reference level of a factor and the reference level of
that factor.
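
As a small check of this reading, here is a sketch of R's default treatment
coding for two factors in a purely parametric, additive model. It uses a toy
2 x 2 design with hypothetical names (cond, shp, y) and ignores the smooth
terms entirely, so it only illustrates the parametric part.

```r
# Toy 2 x 2 design; factor and level names are hypothetical
toy <- expand.grid(cond = factor(c("c1", "c2")),
                   shp  = factor(c("s1", "s2")))

# Default treatment contrasts: the reference cell (c1, s1) has only the
# intercept column set to 1
X <- model.matrix(~ cond + shp, data = toy)
print(X)

# Noise-free additive response: baseline 1, +2 for c2, +3 for s2
toy$y <- 1 + 2 * (toy$cond == "c2") + 3 * (toy$shp == "s2")
fit <- lm(y ~ cond + shp, data = toy)

# The intercept is the (c1, s1) cell; each other coefficient is the shift
# for one non-reference level, the same at every level of the other factor
print(coef(fit))
```

In this additive toy model the coefficient for cond is the same whichever
level of shp is held fixed, because there is no interaction term.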

I have two questions

Q1:
Does anyone know why I get NA results in the second row of the parametric
coefficients (igcCAT.ideo: estimate and standard error 0, z value and p-value
NA)?
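
One general situation that produces an unestimable coefficient is a linear
dependence among the model-matrix columns, for example when a level of one
factor only ever occurs together with one level of another. The sketch below
shows this in base R with toy data (g, h and y are hypothetical names).
Whether anything like it applies to m2.4 cannot be told from the summary
alone. The "Rank: 652/655" line only suggests that three coefficients were
not identifiable.

```r
set.seed(2)
# Toy data: level g2 occurs if and only if level h2 occurs, so the dummy
# columns for g2 and h2 are identical
cells <- data.frame(
  g = factor(c("g1", "g1", "g3", "g3", "g2")),
  h = factor(c("h1", "h3", "h1", "h3", "h2"))
)
toy <- cells[rep(seq_len(nrow(cells)), each = 10), ]
toy$y <- rnorm(nrow(toy))

# Cross-tabulation exposes the confounded cells
print(table(toy$g, toy$h))

# Compare the numerical rank of the design matrix with its column count
X <- model.matrix(~ g + h, data = toy)
cat("rank:", qr(X)$rank, "of", ncol(X), "columns\n")

# lm() reports the aliased coefficient as NA
fit <- lm(y ~ g + h, data = toy)
print(coef(fit))
```

Running table(data$igc, data$shape) on the real data would be the analogous
first check.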

Q2:
As I understand it, the term igcCAT.ideo denotes the difference in the
intercept between
(A): condition = CAT.ideo, and
(B): condition = CAT.pseudo & shape = rand21.
But what is the level of shape for (A)?
Is it the reference level (rand21)? Or is it, perhaps, the "grand mean" over
the levels of shape?


Thank you in advance for your time,
Fotis


-- 
PhD Candidate
Department of Philosophy and History of Science
University of Athens, Greece.
http://users.uoa.gr/~aprotopapas/LLL/en/members.html#fotisfotiadis

Notice: Please do not use this account for social network invitations, for
sending chain mail to me, or as if it were a Facebook account. Thank you for
respecting my privacy.
