[R-sig-ME] lmer: No significant coefficients, but significant improvement of model fit?

r-sig-mixed-models at gjyp.nl r-sig-mixed-models at gjyp.nl
Wed Oct 31 11:14:17 CET 2012


Hey all,

This is my first post - but I assume that like at other lists, brevity is appreciated, so I have a short version and a long version:

SHORT VERSION, QUESTIONS ONLY:

1) How is it possible that, using lmer, none of the fixed effects has a significant coefficient, yet the model with those parameters fits significantly better than a model without them? Is this an example of why lmer didn't use to report p-values for the coefficients?
2) What exactly do the slash and the colon mean when specifying lmer models?

LONG VERSION WITH BACKGROUND:

I am inexperienced with mixed models, but I have a dataset with several levels that needs to be analysed - and I 'always' wanted to learn multilevel analysis anyway, so I decided this was a good occasion. However, there are no courses at hand in the near future, so I'm trying to get there with online resources and some books (such as "Discovering Statistics Using R" by Andy Field and, in a slightly different category, the Multilevel Analysis book by Joop Hox and the one by Snijders & Bosker). However, apparently, I lack what it takes to learn this autodidactically :-/ So I apologise, but I decided to draw on your wisdom. I'm also kind of hoping that doing multilevel analyses is a good way of learning how to do them.

I must admit that I don't feel I have mastered the lmer model formulation, but I found a post by Harold Doran [1] where he explains the lmer syntax. My data file is structured the same way as the one he models in fm3, fm4 and fm5. I have the following variables (of interest):
* cannabisUse_bi: a factor with two levels, '0' and '1'. '0' indicates no cannabis use in the past week; '1' indicates cannabis use in the past week. This is the dependent variable (i.e. the criterion).
* moment: a factor with two levels, 'before' and 'after'
* id.factor: a factor with 444 levels, identifying each participant (note that there are quite a lot of missing values; only about 276 cases have no missing values)
* school: a factor with 8 levels, each representing the school that the participants attend
* cannabisShow: a factor with 2 levels, 'control' and 'intervention' - this reflects whether or not a participant received the 'intervention', which aimed to decrease cannabis use. Participants in five schools received the intervention; participants in three other schools didn't.

Every person provided two datapoints (one before the intervention took place, and one after); there are several persons in each school; and there are several schools in each condition (level) of cannabisShow.

As far as I understand, this translates to "Moment is nested within person ('id.factor'), which is nested within school, which is nested within cannabisShow" (not sure about that last bit).
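The nesting described above can be checked directly in the data. The following is only a minimal sketch on a hypothetical toy data frame (`toy` and the helper `n_distinct_within` are made up for illustration; they are not part of the original script). It counts how many schools each person appears in, and how many conditions each school appears in; a count of 1 everywhere is consistent with strict nesting.

```r
# Hypothetical toy data in the same long format: two rows per participant.
toy <- data.frame(
  id.factor    = factor(rep(1:6, each = 2)),
  moment       = factor(rep(c("before", "after"), times = 6),
                        levels = c("before", "after")),
  school       = factor(rep(c("A", "A", "B", "B", "C", "C"), each = 2)),
  cannabisShow = factor(rep(c("intervention", "intervention",
                              "intervention", "intervention",
                              "control", "control"), each = 2))
)

# Hypothetical helper: number of distinct `inner` values per level of `outer_unit`.
n_distinct_within <- function(inner, outer_unit) {
  tapply(as.character(inner), outer_unit, function(x) length(unique(x)))
}

schools_per_person    <- n_distinct_within(toy$school, toy$id.factor)
conditions_per_school <- n_distinct_within(toy$cannabisShow, toy$school)

# TRUE means the lower level never crosses units of the higher level.
person_nested_in_school    <- all(schools_per_person == 1)
school_nested_in_condition <- all(conditions_per_school == 1)

print(person_nested_in_school)
print(school_nested_in_condition)
```

Note that in the models further down, cannabisShow enters as a fixed effect rather than as a grouping factor, so only school and id.factor appear after the `|`.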

Harold's post has a structure where he has 'year' within 'student' within 'school'. For the sake of convenience, here's his model:

fm5 <- lmer(math ~ year +(year|schoolid/childid), egsingle, control=list(gradient = FALSE, niterEM = 0))

If I translate that to my situation, I get (also based on a number of online sources [2]):

rep_measures.new.moment <- lmer(usedCannabis_bi ~ 1 + moment + (moment|school/id.factor), family=binomial(link = "logit"), data=dat.long);

(I left out the control statement, since this causes R to complain about unused parameters; the post was from 2006, so I guess lmer changed or something?)

Now, this model doesn\'t include the effect of the intervention, and if I include that, I get:

rep_measures.new.model <- lmer(usedCannabis_bi ~ 1 + moment * cannabisShow + (moment|school/id.factor), family=binomial(link = "logit"), data=dat.long);

If I compare these two models using anova(), the second one fits better (logLik from -182.02 to -166.68, ChiSq = 30.681, Df = 2, p = 2.177e-07). However, when you look at rep_measures.new.model, none of the fixed effects is significant. I may be completely wrong, but doesn't this mean that neither the cannabisShow variable nor its interaction with measurement moment (i.e. 'time') contributes to explaining the dependent variable (i.e. cannabisUse_bi)?
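As a generic illustration (not a diagnosis of this dataset), a joint likelihood-ratio test and the per-coefficient Wald tests answer different questions, and they can disagree when the estimates of the added terms are strongly correlated with each other. The sketch below uses simulated data and a plain logistic glm() without random effects; the variables `x1`, `x2` and `y` are hypothetical, and `x2` is deliberately built as a near-copy of `x1`.

```r
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.01)            # almost identical to x1
y  <- rbinom(n, size = 1, prob = plogis(x1 + x2))

fit0 <- glm(y ~ 1,       family = binomial)   # model without the predictors
fit1 <- glm(y ~ x1 + x2, family = binomial)   # model with both predictors

# Wald p-values: each coefficient tested on its own, given the other.
wald_p <- coef(summary(fit1))[c("x1", "x2"), "Pr(>|z|)"]

# Likelihood-ratio test: both predictors tested together (Df = 2).
lrt     <- anova(fit0, fit1, test = "Chisq")
joint_p <- lrt[2, "Pr(>Chi)"]

print(wald_p)
print(joint_p)
```

In a setup like this the individual standard errors are typically very large, because the data cannot tell the two predictors apart, while the joint test clearly favours the larger model. Whether something similar applies to the moment * cannabisShow terms would need to be checked in the actual fit, for instance by looking at the correlation of the fixed effects.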

(In fact, I'm also a bit confused about the p-values that lmer provides for the fixed effects. I thought that there were good reasons not to report them, and that lmer wasn't supposed to? [3] (I don't understand the post, as I'm sadly not a statistician, but I thought I got the gist.) Apparently this changed . . . ?)

And now that I'm mailing anyway: what is the difference between these two models?

rep_measures.new.model.1 <- lmer(usedCannabis_bi ~ 1 + moment * cannabisShow + (moment|school/id.factor), family=binomial(link = "logit"), data=dat.long);
rep_measures.new.model.2 <- lmer(usedCannabis_bi ~ 1 + moment * cannabisShow + (moment|id.factor:school), family=binomial(link = "logit"), data=dat.long);

R gives slightly (but only slightly) different coefficient estimates; for the first model, R seems to understand that school is a level (with 8 values), whereas for the second one this is apparently not specified . . . What's the difference between the slash and the colon for indicating levels (apparently the levels have to be 'the other way around')?
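In R's formula language, `a/b` is shorthand for `a + a:b`, and lme4 applies the same expansion to grouping factors, so `(moment|school/id.factor)` stands for `(moment|school) + (moment|school:id.factor)`, whereas `(moment|id.factor:school)` contains only the person-within-school term. The sketch below shows the expansion with base R's terms(); the toy vectors `school_toy` and `id_toy` are hypothetical.

```r
# The slash expands into the outer factor plus the outer:inner combination.
nested_terms <- attr(terms(~ school/id.factor), "term.labels")
print(nested_terms)   # "school" "school:id.factor"

# The colon alone gives only the combination, without a term for school.
colon_terms <- attr(terms(~ id.factor:school), "term.labels")
print(colon_terms)    # "id.factor:school"

# The order around the colon does not change which groups are formed.
school_toy <- factor(c("A", "A", "B", "B"))
id_toy     <- factor(c(1, 2, 1, 2))       # ids reused across schools
groups_ab  <- nlevels(interaction(school_toy, id_toy, drop = TRUE))
groups_ba  <- nlevels(interaction(id_toy, school_toy, drop = TRUE))
print(c(groups_ab, groups_ba))
```

This would be consistent with school showing up as a level with 8 values only in the first model: the second model has no separate random-effects term for school.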

I'm sorry to bother the list with such basic questions. I've been looking for a tutorial or explanation, but I've only been able to find little bits of information that I pieced together into my current (lack of :-)) understanding . . .

Thank you in advance!

Gjalt-Jorn Peters


PS: I've put the R script at http://sciencerep.org/files/7/the%20cannabis%20show%20-%20analyses.r
(the part I'm talking about now starts after the line with "###### Behaviour", line 195 - the real analyses I'm talking about now start at line 314)
This .R file downloads the data from http://sciencerep.org/files/7/the%20cannabis%20show%20-%20data.tsv

The output you should get is at http://sciencerep.org/files/7/the%20cannabis%20show%20-%20output.txt
(but the output file is kind of hard to interpret without the analyses file, as I didn't "cat" all comments)


[1] http://tolstoy.newcastle.edu.au/R/e2/help/06/10/3345.html
[2] http://www.rensenieuwenhuis.nl/r-sessions-17-generalized-multilevel-lme4/
    http://www.talkstats.com/showthread.php/14393-Need-help-with-lmer-model-specification-syntax-for-nested-mixed-model
    http://www.bodowinter.com/tutorial/bw_LME_tutorial.pdf
[3] https://stat.ethz.ch/pipermail/r-help/2006-May/094765.html


