[R-sig-ME] how to specify the response (dependent) variable in a logistic regression model

Fri Jan 15 23:56:26 CET 2021

I think you can also do this in lme4 with a little bit more work; see:

https://rpubs.com/bbolker/3336
https://mac-theobio.github.io/QMEE/lectures/MultivariateMixed.notes.html
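
A minimal sketch of one way the "little bit more work" could look, assuming the stacked ("long") data layout: both responses go into one column, and a factor (here called trait) says which response each row holds. The data below are simulated, and every name in it (d, long, trait, value, x, subject) is hypothetical rather than taken from the experiment.

```r
library(lme4)

set.seed(1)
n_subj <- 30
n_trial <- 40

# Simulated stand-in data: one row per trial, two binary responses per trial.
d <- data.frame(
  subject = factor(rep(seq_len(n_subj), each = n_trial)),
  x = rnorm(n_subj * n_trial)
)
s <- as.integer(d$subject)
u1 <- rnorm(n_subj, sd = 0.8)           # subject effect on Resp1
u2 <- 0.6 * u1 + rnorm(n_subj, sd = 0.5) # correlated subject effect on Resp2
d$Resp1 <- rbinom(nrow(d), 1, plogis(0.8 * d$x + u1[s]))
d$Resp2 <- rbinom(nrow(d), 1, plogis(-0.5 * d$x + u2[s]))

# Stack the two responses into one column, labelled by 'trait'.
long <- rbind(
  data.frame(subject = d$subject, x = d$x, trait = "Resp1", value = d$Resp1),
  data.frame(subject = d$subject, x = d$x, trait = "Resp2", value = d$Resp2)
)
long$trait <- factor(long$trait)

# Trait-specific intercepts and slopes, plus a 2 x 2 covariance matrix of
# subject effects across the two responses.
fit <- glmer(value ~ 0 + trait + trait:x + (0 + trait | subject),
             data = long, family = binomial)

fixef(fit)
VarCorr(fit)  # includes the between-response correlation of subject effects
```

With binary responses this only captures the correlation between responses at the subject level; a trial-level correlation would need additional structure.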


On 1/14/21 6:27 PM, John Kingston wrote:
> Dear Phillip and Greg,
> Thank you both very much.
> 
> I don't have experience yet beyond lme4, but you've both given me useful
> directions to pursue.
> 
> I'll come back with results once they're in hand.
> Best,
> John
> 
> John Kingston
> Professor
> Linguistics Department
> University of Massachusetts
> N434 Integrative Learning Center
> 650 N. Pleasant Street
> Amherst, MA 01003
> 1-413-545-6833, fax -2792
> jkingstn using umass.edu
> https://blogs.umass.edu/jkingstn
> 
> 
> On Thu, Jan 14, 2021 at 11:41 AM Phillip Alday <me using phillipalday.com> wrote:
> 
>> John,
>>
>> How comfortable are you with mixed models software beyond lme4? This
>> seems like a perfect case for a multivariate mixed model (which you can
>> do with e.g. brms or MCMCglmm). The basic idea is that you create a
>> single mixed model that can be thought of as fitting two GLMMs
>> simultaneously. Here's the basic syntax for doing this in brms:
>>
>>
>> brm(mvbind(Resp1, Resp2) ~ preds + ..., data=your_data, family=binomial)
>>
>> You can also specify this as two formulae (which really highlights the
>> "two models simultaneously" intuition):
>>
>>
>> var1 = bf(Resp1 ~ preds + ....) + binomial()
>> var2 = bf(Resp2 ~ preds + ....) + binomial()
>>
>> brm(var1 + var2, data=your_data)
>>
>> The advantage to doing this as a multivariate model as opposed to
>> separate models is that you get simultaneous estimates across both
>> models, including correlation/covariance between those estimates.  See
>> e.g. the brms documentation
>> (https://paul-buerkner.github.io/brms/articles/brms_multivariate.html)
>> for more info. In particular, pay attention to the extra syntax for
>> computing shared correlation in the random effects across sub-models.
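
A minimal sketch of that extra syntax, assuming 0/1 responses (hence bernoulli() here instead of binomial) and hypothetical variable names x and subject. It only builds the multivariate formula; the actual fit is left commented out because it needs a working Stan toolchain.

```r
library(brms)

# Hypothetical names: Resp1 and Resp2 are 0/1 responses, x is a predictor,
# subject identifies the listener. The shared label "p" between the bars asks
# brms to model the subject intercepts of both sub-models as correlated.
f1 <- bf(Resp1 ~ x + (1 |p| subject)) + bernoulli()
f2 <- bf(Resp2 ~ x + (1 |p| subject)) + bernoulli()

# Residual correlations are not available for these families, so they are
# switched off explicitly.
mv_formula <- f1 + f2 + set_rescor(FALSE)
print(mv_formula)

# Fitting can be slow:
# fit <- brm(mv_formula, data = your_data)
```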
>>
>> The cons for this approach are that [1] most reviewers in
>> (psycho)linguistics will not be familiar with it (and there was recently
>> a Twitter storm on this very problem) and [2] the computational costs
>> are noticeably higher.
>>
>> Another alternative is to do something like "linked mixed models" (cf.
>> Hohenstein, Matuschek and Kliegl, PBR 2016). There are a few variants on
>> this, but the basic idea is that you use one response to predict the
>> other. Given the temporal ordering here, this might make sense, e.g.
>>
>> mod1 = glmer(Resp1 ~ preds + ...., data=your_data, family=binomial)
>> mod2 = glmer(Resp2 ~ preds + YYY + ...., data=your_data, family=binomial)
>>
>> where YYY is one of:
>> [a] Resp1
>> [b] fitted(mod1)
>> [c] fitted(mod1) + resid(mod1)
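
A minimal sketch of variants [a] and [b] on simulated data, assuming random intercepts by subject only. All names (d, x, subject, fit1) are hypothetical, and the simulated effect sizes are arbitrary.

```r
library(lme4)

set.seed(2)
n_subj <- 30
n_trial <- 40

# Simulated stand-in data in which Resp2 partly depends on Resp1.
d <- data.frame(
  subject = factor(rep(seq_len(n_subj), each = n_trial)),
  x = rnorm(n_subj * n_trial)
)
s <- as.integer(d$subject)
u <- rnorm(n_subj, sd = 1.0)   # subject effect on Resp1
v <- rnorm(n_subj, sd = 0.7)   # subject effect on Resp2
d$Resp1 <- rbinom(nrow(d), 1, plogis(0.8 * d$x + u[s]))
d$Resp2 <- rbinom(nrow(d), 1, plogis(-0.5 * d$x + 1.0 * d$Resp1 + v[s]))

# First model: Resp1 on its own.
mod1 <- glmer(Resp1 ~ x + (1 | subject), data = d, family = binomial)

# Variant [a]: the observed Resp1 predicts Resp2.
mod2a <- glmer(Resp2 ~ x + Resp1 + (1 | subject), data = d, family = binomial)

# Variant [b]: the fitted probability of Resp1 predicts Resp2.
# fitted() on a glmer fit returns values on the response (probability) scale.
d$fit1 <- fitted(mod1)
mod2b <- glmer(Resp2 ~ x + fit1 + (1 | subject), data = d, family = binomial)

fixef(mod2a)
fixef(mod2b)
```

Note that fit1 is a function of x and the subject effect, so it can be strongly correlated with x; that is worth checking before interpreting the coefficients of mod2b.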
>>
>> You can potentially omit mod1, in which case you have something like the
>> Davidson and Martin (Acta Psychologica, 2016) approach to the joint
>> analysis of reaction times and response accuracy.
>>
>> The downside to this approach is that the variability that's in Resp1
>> can create problems in mod2, because standard GLMMs assume that the
>> predictors are measured without error/variability. Variants [b] and
>> especially [c] mitigate this a bit though. (And if you want to get even
>> more complicated, there are "errors-in-variables" models, which can
>> handle this and are available in e.g. brms). I think the advantage to
>> the linked model approach relative to the multivariate approach is that
>> it's somewhat more accessible for a typical (psycho)linguistic reviewer.
>>
>> Note that I am nominally originally from linguistics and do know a bit
>> about mixed models, so I'm a good usual suspect for a reviewer on these
>> things.
>>
>> Best,
>> Phillip
>>
>> PS: the multinomial models suggested by the others are also pretty good,
>> but again multinomial models usually take some getting used to and
>> don't reflect the potential covariance of Resp1 and Resp2 in an obvious
>> way.
>>
>>
>>
>> On 14/1/21 5:05 pm, Greg Snow wrote:
>>> John,
>>>
>>> I agree that ordering your responses does not make sense, but the
>>> multinomial models are for unordered categorical data.  So you can
>>> just treat your 4 possible outcomes as unordered categories.
>>>
>>> Another option is to convert to a Poisson regression where the
>>> response variable is the count (number of times each of the 4
>>> combinations is selected) and then your categories become
>>> explanatory/predictor variables.  You can either use a single
>>> predictor with the 4 levels (and choose appropriate indicator
>>> variables) or you can have 2 predictors (b vs w and 1 vs 2) as well as
>>> their interaction.  That would give a different interpretation of the
>>> model, but may be more what you are trying to accomplish.
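
A minimal sketch of the two-predictor version of this Poisson formulation in base R, using made-up counts pooled over listeners and stimuli. It deliberately ignores the repeated-measures structure; a mixed-model version would need counts per listener and random effects.

```r
# Hypothetical counts of each of the 4 response combinations.
counts <- data.frame(
  consonant = factor(c("b", "b", "w", "w")),
  syllables = factor(c("1", "2", "1", "2")),
  n = c(120, 40, 35, 105)
)

# Two predictors plus their interaction: the saturated log-linear model.
fit_sat <- glm(n ~ consonant * syllables, data = counts, family = poisson)

# Without the interaction: "b"/"w" and "1"/"2" responses treated as independent.
fit_ind <- glm(n ~ consonant + syllables, data = counts, family = poisson)

# The interaction coefficient is the log odds ratio between the two responses.
coef(fit_sat)["consonantw:syllables2"]

# Test of the association between the two responses.
anova(fit_ind, fit_sat, test = "Chisq")
```

In this formulation the association between the "b"/"w" and "1"/"2" responses shows up as the interaction term, and stimulus effects on each response would enter as further interactions with consonant and syllables.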
>>>
>>> On Thu, Jan 14, 2021 at 8:44 AM John Kingston <jkingstn using umass.edu> wrote:
>>>>
>>>> Dear Thierry,
>>>> Thanks for your question. Here's the reason why I think the responses
>>>> aren't multinomial (or ordinal).
>>>>
>>>> The listeners were presented with spoken strings of the form CVC,
>>>> where C = consonant and V = vowel. The rate at which the acoustics
>>>> changed at the beginning of the syllable was varied orthogonally with
>>>> the duration of the vowel. The rate of acoustic change conveyed the
>>>> identity of the initial consonant, which was expected to sound like
>>>> "b" when the rate of change was faster and like "w" when it was
>>>> slower. The duration of the vowel conveyed how many syllables the
>>>> string consisted of, which was expected to be "1" when the vowel was
>>>> shorter and "2" when the vowel was longer. The listeners were
>>>> instructed to respond with "b" or "w" and "1" or "2" on every trial.
>>>> So, unlike a truly multinomial dependent variable, such as
>>>> professions or majors, the responses here are not unordered. They
>>>> also cannot be arranged into a single order sensibly, because even if
>>>> "b1" and "w2" responses are first and last in the order, there's no
>>>> way of deciding *a priori* the order of "b2" and "w1" responses.
>>>>
>>>> Again, thanks for your reply.
>>>> Best,
>>>> John
>>>>
>>>>          [[alternative HTML version deleted]]
>>>>
>>>> _______________________________________________
>>>> R-sig-mixed-models using r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models