[R-sig-ME] Random effects in multinomial regression in R?
Phillip Alday
phillip.alday using mpi.nl
Thu Mar 21 12:52:16 CET 2019
I honestly don't know, because I don't know enough about the structure of
your data.
Phillip
On 20/3/19 10:46 pm, Souheyla GHEBGHOUB wrote:
> Dear Philip,
>
> Thank you for the clarification. I agree.
> So does your response mean that I should just use (1|Subject)?
>
> Thanks again
> Souheyla
>
> On Wed, 20 Mar 2019, 18:51 Alday, Phillip <Phillip.Alday using mpi.nl> wrote:
>
> Please keep the list in CC.
>
> I really can't provide more advice about whether to do an
> intercept-only model or include the Pretest score in the random
> effects without knowing more about your data. If you have multiple
> pretest scores per subject and word, then it might make sense to
> include them in the random effects, *depending on your data and
> research question*. If you don't, then it definitely doesn't make
> sense to estimate a slope (i.e., a rate) from a single static
> observation.
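
A minimal sketch of how one might check this before choosing between an intercept-only term and a Pretest slope. It assumes long-format data with one row per Subject-by-Word observation; the data frame `d` and its values are made up for illustration, and only the column names come from the thread. It reads "multiple pretest scores" as "Pretest takes more than one value within a group", which is one possible interpretation.

```r
# Hypothetical long-format data: one row per Subject x Word observation.
# Column names follow the thread (Subject, Word, Pretest); the values are made up.
d <- data.frame(
  Subject = c("s1", "s1", "s1", "s2", "s2", "s2"),
  Word    = c("w1", "w2", "w3", "w1", "w2", "w3"),
  Pretest = c(0, 1, 1, 0, 0, 0)
)

# Number of distinct Pretest values within each Subject and within each Word.
n_by_subject <- tapply(d$Pretest, d$Subject, function(x) length(unique(x)))
n_by_word    <- tapply(d$Pretest, d$Word,    function(x) length(unique(x)))

# A by-group slope for Pretest can only be informed by groups in which
# Pretest actually varies, i.e. where the count is greater than 1.
print(n_by_subject)
print(n_by_word)
```

If most groups show a count of 1, there is little or nothing in the data to estimate a by-group Pretest slope from, which is the situation described above.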
>
> Phillip
>
> On 20/3/19 7:33 pm, Souheyla GHEBGHOUB wrote:
>> Hi again Phillip,
>>
>> My question is: I'd like to add random effects of *Subject* and *Word*,
>> which may differ by time from pretest to posttest, but I don't have an
>> effect of *Time*, so I can't do:
>>
>> mod1 <- brm(Change ~ Pretest + Group + (Time|Subject) + (Time|Word))
>>
>> So should I just do (1|Subject) + (1|Word), or
>> (Pretest|Subject) + (Pretest|Word), or exclude random effects?
>> Thank you for looking into this :)
>> Souheyla
>>
>> On Wed, 20 Mar 2019 at 18:28, Phillip Alday <phillip.alday using mpi.nl> wrote:
>>
>> On 20/3/19 6:39 pm, Souheyla GHEBGHOUB wrote:
>> > Hi Philip,
>> >
>> > Thank you for the clarification. But I might not have made it clear in
>> > my question.
>> >
>> > I don't have Time in my data at all because I chose to predict change
>> > rather than having posttest and pretest responses as DV and Time as
>> > a fixed effect.
>>
>> If Time isn't in your difference data, then it really makes no sense to
>> have it in your model anywhere ....
>>
>> > I chose this way because I have groups of subjects who were tested on
>> > words, and I was not sure whether a simple regression with
>> > Responses as DV and Time (Pretest/Posttest) as IV would take into
>> > account differences between Pretest and Posttest at the level of each
>> > word. That is, I don't know whether it would sum the overall pretest
>> > score of each subject and then compare it to the posttest, while I want
>> > it to compare each subject's score on each word from pretest to posttest
>> > and then base its analysis on these score changes.
>>
>> I don't want to be too harsh, but if you were unsure about that, then
>> that's the question you should have asked first. (See also the XY
>> problem, https://en.wikipedia.org/wiki/XY_problem)
>>
>> >
>> > That's why I did not want to risk it and chose /score change/ as the DV
>> > instead. But I was then faced with another problem: the absence of a
>> > Time effect by which subjects differ for my random slopes.
>>
>> Assuming you want to compute the difference outside of the model, then
>> you could (and I would argue should) still use the continuous/numeric
>> difference and not a categorical thresholding of that difference as your
>> dependent variable.
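
As a rough sketch of that suggestion, assuming the pretest and posttest scores are numeric: the change is computed outside the model and kept numeric, and a three-way label is derived only afterwards. The data frame `d`, its values, and the `ChangeCat` column are hypothetical; the labels gain, no_gain and decline are the ones used in the original question.

```r
# Hypothetical scores; column values are made up for illustration.
d <- data.frame(
  Subject  = c("s1", "s1", "s2", "s2"),
  Word     = c("w1", "w2", "w1", "w2"),
  Pretest  = c(1, 2, 3, 0),
  Posttest = c(3, 2, 1, 0)
)

# Numeric change, computed outside the model and usable as a continuous DV.
d$Change <- d$Posttest - d$Pretest

# A three-way label can still be derived afterwards, if it is needed at all.
d$ChangeCat <- factor(sign(d$Change), levels = c(-1, 0, 1),
                      labels = c("decline", "no_gain", "gain"))
print(d)
```

Keeping `Change` numeric preserves the size of each gain or decline, which the three-way label throws away.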
>>
>> In that case, I would argue that there can't be a "Time" effect by
>> subject because you are measuring the difference, which incorporates the
>> variance at each Time in the variance of the difference. Same for word.
>>
>> Depending on the exact structure of the test and whether there are
>> multiple pretest scores by subject or by word, you could potentially
>> include that as a random slope, but to make a more precise
>> recommendation there, we need to know more about your data.
>>
>> Best,
>> Phillip
>>
>>
>>
>>
>>
>> >
>> > Best,
>> > Souheyla
>> >
>> > On Wed, 20 Mar 2019 at 17:02, Phillip Alday <phillip.alday using mpi.nl> wrote:
>> >
>> > Generally speaking for the parameterization of mixed-effects models in
>> > lme4/brms/the usual packages, it doesn't make sense to have a varying
>> > slope (e.g. Time|Subject) without the corresponding fixed effect. This
>> > is because the varying slopes are calculated as offsets from the group
>> > mean, i.e., from the fixed effect estimate. Not including the fixed
>> > effect is equivalent to assuming the group mean is zero, which is
>> > usually not the assumption you want to make.
>> >
>> > If you fit models with random slopes without the corresponding fixed
>> > effects, then there are two main problems:
>> >
>> > 1. The corresponding variance parameter will be mis-estimated because it
>> > will be the average squared distance to zero and not the average squared
>> > distance to the mean (and average squared distance to the mean is the
>> > definition of variance).
>> >
>> > 2. The model may not converge because the numerics are set up under the
>> > "zero mean" assumption. For lme4/nlme, this is the case, but I believe
>> > that brms may do some internal reparameterization that may avoid these
>> > difficulties. (And a model fit with MCMC (brms) may not have the same
>> > numerical issues as a model fit with MLE (lme4)).
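
The arithmetic behind point 1 can be sketched with a few made-up numbers. This is only a toy illustration of "distance to zero" versus "distance to the mean"; it is not how lme4 or brms estimate variance components, and the `slopes` vector is hypothetical.

```r
# Made-up by-subject slopes whose mean is clearly not zero.
slopes <- c(1.8, 2.1, 2.4, 1.9, 2.3)

# Average squared distance to the mean: the variance (population form, divisor n).
var_about_mean <- mean((slopes - mean(slopes))^2)

# Average squared distance to zero: what a zero-mean assumption effectively measures.
var_about_zero <- mean(slopes^2)

# The second quantity exceeds the first by the squared mean of the slopes.
print(c(about_mean = var_about_mean, about_zero = var_about_zero))
```

The further the true mean slope is from zero, the more the "about zero" figure overstates the spread between subjects.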
>> >
>> > In brief: just add time as a fixed effect.
>> >
>> > Also: why not fit your model as a continuous model with pre vs. post as
>> > a contrast in the model rather than reducing a continuous variable to a
>> > category? You can still apply a categorical distinction afterwards if
>> > you so desire, but in my experience, it's best to defer making things
>> > categorical until as late as possible (see also Frank Harrell's comments
>> > on prediction vs. classification:
>> > http://www.fharrell.com/post/classification/). Moreover, it's a lot
>> > easier to fit a continuous model than a multinomial one ....
>> >
>> > Best,
>> > Phillip
>> >
>> > On 18/3/19 7:11 pm, Souheyla GHEBGHOUB wrote:
>> > > I have *Change* from Pretest to Posttest (gain, no_gain, decline) as the
>> > > DV. Also *Pretest* and *Group* as covariates. This called for a
>> > > multinomial regression:
>> > >
>> > > mod0 <- brm(Change ~ Pretest + Group)
>> > >
>> > > *Question:* I'd like to add random effects of *Subject* and *Word*,
>> > > which may differ by time, but I don't have an effect of *Time* to do:
>> > >
>> > > mod1 <- brm(Change ~ Pretest + Group + (Time|Subject) + (Time|Word))
>> > >
>> > > So I thought of this:
>> > >
>> > > mod2 <- brm(Change ~ Pretest + Group + (1|Subject) + (1|Word))
>> > >
>> > > but this also seems wrong to me. What do you think is the best way to
>> > > treat random effects in this situation, please?
>> > >
>> > > Thank you
>> > >
>> > > Souheyla Ghebghoub