[R-sig-ME] Modeling truncated counts with glmer
João C P Santiago
joao.santiago at uni-tuebingen.de
Thu Feb 2 14:10:41 CET 2017
Dear Thierry,
Thank you, that makes sense now! I have been reading more on this and
playing with the data to understand it better. Here are some final
questions:
I've reduced the model to only include the abruf term to simplify things:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)   -0.0865     0.1909 -0.4532   0.6504
I(abruf - 1)   1.3241     0.0505 26.2030   0.0000
The log-odds (not the probability) of answering correctly is given by the
linear predictor -0.0865 + 1.3241 * (abruf - 1).
* When I(abruf - 1) is zero, i.e. on the first trial, the intercept gives
the log-odds of an average person answering correctly.
So in this case that means odds of exp(-0.0865)=0.92 and a probability
of plogis(-0.0865)=0.48 (which means on average 0.48*40= 19 correct
pairs)
* For each subsequent trial the odds of answering correctly are multiplied
by exp(1.3241) = 3.76, i.e. almost 4x higher odds than on the previous
trial. (Note that plogis(1.3241) = 0.79 is not a "79% increase": applying
plogis to the slope alone has no direct interpretation, because the change
in probability depends on where you start on the logistic curve.)
This means our average joe, on the last trial (I(abruf - 1) = 2), has
log-odds of -0.0865 + 1.3241*2 = 2.5617, i.e. odds of exp(2.5617) = 13 in
favour of a correct answer; the odds ratio relative to trial 0 is
exp(2*1.3241) = 14, not exp(2.5617). This translates to plogis(2.5617) =
0.93 probability of success, or about 37 correct pairs.
* If I want to predict how well a person will do in this test after a
first trial, I simply need to change the intercept? Let's imagine someone
is not very good at this and only gets 25% of the pairs correct on the
first go. His or her baseline is log(0.25/0.75) = -1.10 on the logit
scale, so on trial 2 the predicted logit is log(0.25/0.75) + 1.3241*2 =
1.55, giving plogis(1.55) = 0.82, or about 33 correct pairs.
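These back-of-the-envelope numbers can be checked directly in R with plogis()
and qlogis(); the coefficients below are copied from the reduced model output
above, and the 25%-correct learner is the hypothetical example from the last
bullet:

```r
b0 <- -0.0865                 # intercept from the reduced model
b1 <-  1.3241                 # slope for I(abruf - 1)

# Trial 1 (I(abruf - 1) = 0): odds, probability, expected correct pairs
exp(b0)                       # odds of a correct answer, about 0.92
plogis(b0)                    # probability, about 0.48
plogis(b0) * 40               # about 19 of 40 pairs

# Trial 3 (I(abruf - 1) = 2)
eta2 <- b0 + b1 * 2           # 2.5617 on the logit scale
exp(eta2)                     # odds of success on that trial, about 13
exp(b1 * 2)                   # odds ratio vs. trial 1, about 14
plogis(eta2) * 40             # about 37 of 40 pairs

# Hypothetical learner starting at 25% correct: swap in their baseline logit
base <- qlogis(0.25)          # log(0.25 / 0.75), about -1.10
plogis(base + b1 * 2)         # about 0.82
plogis(base + b1 * 2) * 40    # about 33 of 40 pairs
```

Keep in mind these are predictions for a subject whose random intercept is
zero; individual subjects' curves shift up or down by their random effect.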
Again, thank you so much for your replies. If you ever come to this neck
of the woods, a free beer is in order!
Best,
J Santiago
Quoting Thierry Onkelinx <thierry.onkelinx at inbo.be>:
> Dear João,
>
> The intercept is -0.07376 on the **logit** scale. That is 0.48 on the
> original scale. Use plogis(-0.07376) to transform from logit to original
> scale. Your interpretation of the intercept is correct.
>
> Best regards,
>
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
> Forest
> team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
> Kliniekstraat 25
> 1070 Anderlecht
> Belgium
>
> To call in the statistician after the experiment is done may be no more
> than asking him to perform a post-mortem examination: he may be able to say
> what the experiment died of. ~ Sir Ronald Aylmer Fisher
> The plural of anecdote is not data. ~ Roger Brinner
> The combination of some data and an aching desire for an answer does not
> ensure that a reasonable answer can be extracted from a given body of data.
> ~ John Tukey
>
> 2017-02-01 14:22 GMT+01:00 João C P Santiago <joao.santiago at uni-tuebingen.de
>> :
>
>> Thank you for your input! Only now did I go back to this model.
>>
>> I'm having some doubts about the meaning of the intercept from my binomial
>> model. Here's the complete output:
>>
>> Generalized linear mixed model fit by maximum likelihood (Laplace
>> Approximation) ['glmerMod']
>> Family: binomial ( logit )
>> Formula: cbind(correctPair, incorrectPair) ~ I(abruf - 1) * treatment +
>> version + (1 | subjectNumber)
>> Data: .
>>
>> AIC BIC logLik deviance df.resid
>> 691.4 708.4 -339.7 679.4 119
>>
>> Scaled residuals:
>> Min 1Q Median 3Q Max
>> -3.2676 -0.7861 -0.0428 0.9417 2.7483
>>
>> Random effects:
>> Groups Name Variance Std.Dev.
>> subjectNumber (Intercept) 0.7135 0.8447
>> Number of obs: 125, groups: subjectNumber, 21
>>
>> Fixed effects:
>> Estimate Std. Error z value Pr(>|z|)
>> (Intercept) -0.07376 0.20096 -0.367 0.714
>> I(abruf - 1) 1.30891 0.06904 18.958 <2e-16 ***
>> treatmentStimulation 0.06116 0.09961 0.614 0.539
>> versionB -0.08709 0.07222 -1.206 0.228
>> I(abruf - 1):treatmentStimulation 0.03342 0.09727 0.344 0.731
>> ---
>> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>
>> Correlation of Fixed Effects:
>> (Intr) I(b-1) trtmnS versnB
>> I(abruf-1) -0.235
>> trtmntStmlt -0.254 0.482
>> versionB -0.189 -0.029 0.037
>> I(-1):trtmS 0.164 -0.681 -0.689 0.030
>>
>>
>>
>> abruf takes the values c(1, 2, 3), so subtracting 1 makes the intercept
>> refer to the first trial rather than extrapolating to abruf = 0.
>>
>> My question is: is the intercept the log of the success/failure odds at
>> I(abruf - 1) = 0, treatment Control and version A? If so, why is it
>> statistically indistinguishable from even odds (zero on the logit scale)?
>> The number of successes clearly increases from abruf 1 to 3 (as seen from
>> the model estimate and the plots).
>>
>> It's the first time I'm dealing with such complex models. Thank you for
>> your patience and time.
>>
>> Best
>> J Santiago
>>
>>
>>
>> Quoting Thierry Onkelinx <thierry.onkelinx at inbo.be>:
>>
>>> It looks like your participants performed a known number of trials, each
>>> of which resulted in either success or failure. The binomial distribution
>>> models exactly that. The model fit would then be the probability of success.
>>>
>>> Once you have the relevant distribution, you can set the relevant
>>> covariates. Which and in which form (linear, polynomial, factor) depends
>>> on
>>> the hypotheses which are relevant for your experiment.
>>>
>>> Best regards,
>>>
>>> ir. Thierry Onkelinx
>>>
>>> 2017-01-23 10:01 GMT+01:00 João C P Santiago <
>>> joao.santiago at uni-tuebingen.de
>>>
>>>> :
>>>>
>>>
>>> Thank you! Could you be a bit more specific as to why? I will most likely
>>>> encounter similar data in the future and I want to know how to think
>>>> about
>>>> it.
>>>>
>>>> Fitting the model with abruf as a factor resulted in a better fit, but
>>>> that answers a different question, right? Namely, how different is the
>>>> estimate at each timepoint compared with the reference level (abruf 0
>>>> in my code)?
>>>>
>>>> Best
>>>>
>>>>
>>>> Quoting Thierry Onkelinx <thierry.onkelinx at inbo.be>:
>>>>
>>>> Dear João,
>>>>
>>>>>
>>>>> A binomial distribution seems more relevant to me.
>>>>>
>>>>> glmer(cbind(correctPair, incorrectPair) ~ I((abruf - 1)^2) * treatment +
>>>>> (1|subjectNumber), data=data, family = binomial)
>>>>>
>>>>> Best regards,
>>>>>
>>>>> ir. Thierry Onkelinx
>>>>>
>>>>> 2017-01-23 8:46 GMT+01:00 João C P Santiago <
>>>>> joao.santiago at uni-tuebingen.de>
>>>>> :
>>>>>
>>>>> Hi,
>>>>>
>>>>>>
>>>>>> In my experiment 20 participants did a word-pairs learning task in two
>>>>>> conditions (repeated measures):
>>>>>> 40 pairs of nouns are presented on a monitor, each for 4s and with an
>>>>>> interval of 1s. The words of each pair were moderately semantically
>>>>>> related
>>>>>> (e.g., brain, consciousness and solution, problem). Two different word
>>>>>> lists were used for the subject’s two experimental conditions, with the
>>>>>> order of word lists balanced across subjects and conditions. The
>>>>>> subject
>>>>>> had unlimited time to recall the appropriate response word, and did
>>>>>> three
>>>>>> trials in succession for each list:
>>>>>>
>>>>>> Condition 1, List A > T1, T2, T3
>>>>>> Condition 2, List B > T1, T2, T3
>>>>>>
>>>>>> No feedback was given as to whether the remembered word was correct or
>>>>>> not.
>>>>>>
>>>>>> I've seen some people approach this with an ANOVA; others subtract the
>>>>>> total number of correct pairs in one condition from the other per
>>>>>> subject and run a t-test. Since this is count data, a generalized
>>>>>> linear model should be more appropriate, right?
>>>>>>
>>>>>> head(data)
>>>>>>   subjectNumber expDay      bmi treatment tones       hour abruf correctPair incorrectPair
>>>>>> 1             1     N2 22.53086   Control     0 27900 secs     1          26            14
>>>>>> 2             1     N2 22.53086   Control     0 27900 secs     2          40             0
>>>>>> 3             1     N2 22.53086   Control     0 27900 secs     3          40             0
>>>>>> 4             2     N1 22.53086   Control     0 27900 secs     1          22            18
>>>>>> 5             2     N1 22.53086   Control     0 27900 secs     2          33             7
>>>>>> 6             2     N1 22.53086   Control     0 27900 secs     3          36             4
>>>>>>
>>>>>>
>>>>>>
>>>>>> I fitted a model with glmer.nb(correctPair ~ I((abruf - 1)^2) *
>>>>>> treatment + (1|subjectNumber), data = data). The residuals don't look
>>>>>> good to me (http://imgur.com/a/AJXGq) and the model predicts values
>>>>>> above 40, which can never happen in real life (not sure if this is
>>>>>> important).
>>>>>>
>>>>>> I'm interested in knowing if there is any difference between the
>>>>>> conditions: are the values at timepoint (abruf) 1 different? Do people
>>>>>> remember less in one condition than in the other (a different number
>>>>>> of pairs at timepoint 3)?
>>>>>>
>>>>>>
>>>>>> If the direction I'm taking is completely wrong please let me know.
>>>>>>
>>>>>> Best,
>>>>>> Santiago
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> João C. P. Santiago
>>>>>> Institute for Medical Psychology & Behavioral Neurobiology
>>>>>> Center of Integrative Neuroscience
>>>>>> University of Tuebingen
>>>>>> Otfried-Mueller-Str. 25
>>>>>> 72076 Tuebingen, Germany
>>>>>>
>>>>>> Phone: +49 7071 29 88981
>>>>>> Fax: +49 7071 29 25016
>>>>>>
>>>>>> _______________________________________________
>>>>>> R-sig-mixed-models at r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>>>>
>>>>>>
>>>>>
>>>>
>>
>>
>>