[R-sig-ME] Random effects in multinomial regression in R?

Souheyla GHEBGHOUB
Wed Mar 20 22:46:01 CET 2019


Dear Phillip,

Thank you for the clarification. I agree.
So does your response mean that I should just use (1|Subject)?

Thanks again
Souheyla

On Wed, 20 Mar 2019, 18:51 Alday, Phillip, <Phillip.Alday using mpi.nl> wrote:

> Please keep the list in CC.
>
> I really can't provide more advice about whether to do an intercept-only
> model or include the Pretest score in the random effects without knowing
> more about your data. If you have multiple pretest scores per subject and
> word, then it might make sense to include them in the random effects,
> *depending on your data and research question*. If you don't, then it
> definitely doesn't make sense to estimate a slope (i.e., a rate) from a
> single static observation.
>
> Phillip
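The condition above (whether Pretest actually varies within a subject or within a word) can be checked before choosing a random-effects structure. A minimal base R sketch, assuming a data frame with one row per Subject x Word pair; the data frame `d`, its values, and the helper `n_distinct` are made up for illustration, and only the column names come from the thread:

```r
# Hypothetical toy data: one row per Subject x Word pair.
d <- data.frame(
  Subject = c("s1", "s1", "s2", "s2"),
  Word    = c("w1", "w2", "w1", "w2"),
  Pretest = c(0, 1, 1, 1)
)

# Number of distinct Pretest values observed within each grouping level.
n_distinct <- function(x) length(unique(x))
by_subject <- tapply(d$Pretest, d$Subject, n_distinct)
by_word    <- tapply(d$Pretest, d$Word, n_distinct)

# A by-Subject (or by-Word) slope for Pretest is only worth considering
# if Pretest varies within most levels of that grouping factor.
print(by_subject)
print(by_word)
```

Levels showing a single distinct value carry no information about a slope, which is the "single static observation" case described above.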
> On 20/3/19 7:33 pm, Souheyla GHEBGHOUB wrote:
>
> Hi again Phillip,
>
> My question is: I'd like to add random effects of *Subject* and *Word*,
> which may differ by time from pretest to posttest, but I don't have an effect
> of *Time*, so I can't do:
>
> mod1 <- brm(Change ~ Pretest + Group + (Time|Subject) + (Time|Word))
>
> So should I just use (1|Subject) + (1|Word), or (Pretest|Subject) +
> (Pretest|Word), or exclude random effects altogether?
>
> Thank you for looking into this :)
>
> Souheyla
>
> On Wed, 20 Mar 2019 at 18:28, Phillip Alday <phillip.alday using mpi.nl> wrote:
>
>> On 20/3/19 6:39 pm, Souheyla GHEBGHOUB wrote:
>> > Hi Phillip,
>> >
>> > Thank you for the clarification. But I might not have made it clear in
>> > my question.
>> >
>> > I don't have Time in my data at all because I chose to predict change
>> > rather than having posttest and pretest responses as DV and Time as
>> > fixed effect.
>>
>> If Time isn't in your difference data, then it really makes no sense to
>> have it in your model anywhere ....
>>
>> > I chose this approach because I have groups of subjects who were tested on
>> > words, and I was not sure whether a simple regression with Responses as
>> > the DV and Time (Pretest/Posttest) as the IV would take into account
>> > differences between Pretest and Posttest at the level of each word. That
>> > is, I don't know whether it would sum each subject's overall pretest
>> > score and compare it to the posttest total, whereas I want it to compare
>> > each subject's score on each word from pretest to posttest and then base
>> > its analysis on these score changes.
>>
>> I don't want to be too harsh, but if you were unsure about that, then
>> that's the question you should have asked first. (See also the XY
>> problem, https://en.wikipedia.org/wiki/XY_problem)
>>
>> >
>> > That's why I did not want to risk it and chose /score change/ as the DV
>> > instead. But then I faced another problem: there is no Time effect
>> > along which subjects could differ, so what do I use for my random slopes?
>>
>> Assuming you want to compute the difference outside of the model, then
>> you could (and I would argue should) still use the continuous/numeric
>> difference and not a categorical thresholding of that difference as your
>> dependent variable.
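To make the contrast between the numeric difference and its thresholded version concrete, here is a small base R sketch. The scores are invented, the column names `Posttest` and `Gain` are hypothetical, and it assumes the pretest and posttest scores are numeric, which the thread does not confirm:

```r
# Hypothetical scores; the thread does not show the real data.
d <- data.frame(
  Pretest  = c(2, 5, 3, 4),
  Posttest = c(4, 5, 1, 9)
)

# Continuous difference, computed outside the model.
d$Gain <- d$Posttest - d$Pretest

# Three-way thresholding of the same difference, using the category
# names from the thread (gain, no_gain, decline).
d$Change <- factor(
  ifelse(d$Gain > 0, "gain", ifelse(d$Gain < 0, "decline", "no_gain")),
  levels = c("decline", "no_gain", "gain")
)

print(d)
```

In this toy data the first and last rows both end up as "gain" even though their numeric gains differ, which is the information the categorical DV discards.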
>>
>> In that case, I would argue that there can't be a "Time" effect by
>> subject because you are measuring the difference, which incorporates the
>> variance at each Time in the variance of the difference. Same for word.
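The point about the difference absorbing the variance at each Time can be checked numerically. The sketch below uses simulated, purely hypothetical pre/post scores and only illustrates the standard identity for the variance of a difference; it is not a statement about what any modelling package does internally:

```r
set.seed(1)

# Simulated, hypothetical pre/post scores for 200 subject-word pairs.
pre  <- rnorm(200, mean = 5, sd = 2)
post <- pre + rnorm(200, mean = 1, sd = 1)

diff_score <- post - pre

# Sample identity: Var(post - pre) = Var(post) + Var(pre) - 2 * Cov(post, pre)
lhs <- var(diff_score)
rhs <- var(post) + var(pre) - 2 * cov(post, pre)

print(c(lhs = lhs, rhs = rhs))
```

The two numbers agree up to floating-point error, so the variance of the difference score already combines the variability at both time points.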
>>
>> Depending on the exact structure of the test and whether there are
>> multiple pretest scores by subject or by word, you could potentially
>> include that as a random slope, but to make a more precise
>> recommendation there, we need to know more about your data.
>>
>> Best,
>> Phillip
>>
>> >
>> > Best,
>> > Souheyla
>> >
>> > On Wed, 20 Mar 2019 at 17:02, Phillip Alday <phillip.alday using mpi.nl> wrote:
>> >
>> >     Generally speaking, for the parameterization of mixed-effects models in
>> >     lme4/brms/the usual packages, it doesn't make sense to have a varying
>> >     slope (e.g. Time|Subject) without the corresponding fixed effect. This
>> >     is because the varying slopes are calculated as offsets from the group
>> >     mean, i.e., from the fixed effect estimate. Not including the fixed
>> >     effect is equivalent to assuming the group mean is zero, which is
>> >     usually not the assumption you want to make.
>> >
>> >     If you fit models with random slopes without the corresponding fixed
>> >     effects, then there are two main problems:
>> >
>> >     1. The corresponding variance parameter will be mis-estimated because it
>> >     will be the average squared distance to zero and not the average squared
>> >     distance to the mean (and average squared distance to the mean is the
>> >     definition of variance).
>> >
>> >     2. The model may not converge because the numerics are set up under the
>> >     "zero mean" assumption. For lme4/nlme, this is the case, but I believe
>> >     that brms may do some internal reparameterization that may avoid these
>> >     difficulties. (And a model fit with MCMC (brms) may not have the same
>> >     numerical issues as a model fit with MLE (lme4)).
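The first problem in the list above is plain arithmetic and can be illustrated without fitting any model. In this base R sketch the per-subject slopes are simulated with an arbitrary, hypothetical mean of 2; it shows only the arithmetic, not what lme4 or brms compute internally:

```r
set.seed(42)

# Hypothetical per-subject slopes scattered around a nonzero mean of 2.
slopes <- rnorm(1000, mean = 2, sd = 0.5)

# Average squared distance to the mean (population-style variance).
around_mean <- mean((slopes - mean(slopes))^2)

# Average squared distance to zero, i.e. what a zero-mean assumption measures.
around_zero <- mean(slopes^2)

# The two quantities differ by exactly the squared mean of the slopes.
print(c(around_mean = around_mean, around_zero = around_zero))
```

With a true mean far from zero, the distance-to-zero quantity is dominated by the squared mean rather than by the spread of the slopes.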
>> >
>> >     In brief: just add time as a fixed effect.
>> >
>> >     Also: why not fit your model as a continuous model with pre vs. post as
>> >     a contrast in the model rather than reducing a continuous variable to a
>> >     category? You can still apply a categorical distinction afterwards if
>> >     you so desire, but in my experience, it's best to defer making things
>> >     categorical until as late as possible (see also Frank Harrell's comments
>> >     on prediction vs. classification:
>> >     http://www.fharrell.com/post/classification/). Moreover, it's a lot
>> >     easier to fit a continuous model than a multinomial one ....
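A rough sketch of what the continuous, pre-vs-post setup might look like in base R. The data values, the column names `Posttest`, `Score`, and `Time`, the numeric scoring, and the `Time * Group` interaction are all assumptions for illustration, not something the thread specifies:

```r
# Hypothetical wide data: one row per Subject x Word, numeric scores assumed.
wide <- data.frame(
  Subject  = c("s1", "s1", "s2", "s2"),
  Word     = c("w1", "w2", "w1", "w2"),
  Group    = c("A", "A", "B", "B"),
  Pretest  = c(2, 5, 3, 4),
  Posttest = c(4, 5, 1, 9)
)

# Stack pretest and posttest into long format with an explicit Time factor.
ids <- wide[c("Subject", "Word", "Group")]
long <- rbind(
  data.frame(ids, Time = "pre",  Score = wide$Pretest),
  data.frame(ids, Time = "post", Score = wide$Posttest)
)
long$Time <- factor(long$Time, levels = c("pre", "post"))

# Time appears both as a fixed effect and as a by-Subject / by-Word slope.
f <- Score ~ Time * Group + (1 + Time | Subject) + (1 + Time | Word)

print(long)
print(f)
```

A formula like `f` could then be passed, together with `long`, to a mixed-model fitting function such as `lme4::lmer()` or `brms::brm()`; that step is not run here, and whether the full slope structure is supported depends on the real data.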
>> >
>> >     Best,
>> >     Phillip
>> >
>> >     On 18/3/19 7:11 pm, Souheyla GHEBGHOUB wrote:
>> >     > I have *Change* from Pretest to Posttest (gain, no_gain, decline) as the
>> >     > DV. Also *Pretest* and *Group* as covariates. This called for a multinomial
>> >     > regression:
>> >     >
>> >     > mod0 <- brm(Change ~ Pretest + Group)
>> >     >
>> >     > *Question:* I'd like to add random effects of *Subject* and *Word*, which
>> >     > may differ by time, but I don't have an effect of *Time* to do:
>> >     >
>> >     > mod1 <- brm(Change ~ Pretest + Group + (Time|Subject) + (Time|Word))
>> >     >
>> >     > So I thought of this:
>> >     >
>> >     > mod2 <- brm(Change ~ Pretest + Group + (1|Subject) + (1|Word))
>> >     >
>> >     > but this also seems wrong to me. What do you think is the best way to treat
>> >     > random effects in this situation, please?
>> >     >
>> >     > Thank you
>> >     >
>> >     > Souheyla Ghebghoub
>> >     >
>> >     >       [[alternative HTML version deleted]]
>> >     >
>> >     > _______________________________________________
>> >     > R-sig-mixed-models using r-project.org mailing list
>> >     > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>> >     >
>> >
>>
>
