[R-sig-ME] Random effects in multinomial regression in R?

Phillip Alday
Wed Mar 20 18:56:38 CET 2019


On 20/3/19 6:39 pm, Souheyla GHEBGHOUB wrote:
> Hi Phillip,
>
> Thank you for the clarification. But I might not have made it clear in
> my question.
> 
> I don't have Time in my data at all because I chose to predict change
> rather than having posttest and pretest responses as DV and Time as
> fixed effect. 

If Time isn't in your difference data, then it really makes no sense to
have it in your model anywhere ....

> I chose this way because I have groups of subjects who were tested on
> words, and I was not sure whether a simple regression with
> Responses as DV and Time (Pretest/Posttest) as IV will take into
> account differences between Pretest and Posttest at the level of each
> word. That is, I don't know whether it will sum the overall pretest
> score of each subject and then compare it to the posttest, while I want
> it to compare each subject's score on each word from pretest to posttest
> and then base its analysis on these score changes.

I don't want to be too harsh, but if you were unsure about that, then
that's the question you should have asked first. (See also the XY
problem, https://en.wikipedia.org/wiki/XY_problem)

> 
> That's why I did not want to risk it and chose /score change/ as the DV
> instead. But I was faced with another problem, which is the absence of a
> Time effect by which subjects could differ in my random slopes.

Assuming you want to compute the difference outside of the model, then
you could (and I would argue should) still use the continuous/numeric
difference and not a categorical thresholding of that difference as your
dependent variable.
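
A minimal sketch of that approach, using simulated data as a stand-in.
Everything about the data here is an assumption made for illustration: the
column names (Subject, Word, Group, Pretest, Posttest, Diff), the numeric
scoring, the threshold of 5 used for the categories, and the random-effects
structure in the lmer() call, which is one plausible specification and not
a recommendation for the actual design.

```r
set.seed(1)

# Hypothetical long-format data: one row per Subject x Word,
# with a numeric score at pretest and at posttest.
dat <- expand.grid(Subject = factor(1:20), Word = factor(1:15))
dat$Group <- factor(ifelse(as.integer(dat$Subject) <= 10, "control", "treatment"))

# Simulated subject- and word-level deviations (purely illustrative).
subj_eff <- rnorm(20, mean = 0, sd = 2)
word_eff <- rnorm(15, mean = 0, sd = 2)

dat$Pretest  <- rnorm(nrow(dat), mean = 50, sd = 10)
dat$Posttest <- dat$Pretest +
  3 * (dat$Group == "treatment") +
  subj_eff[as.integer(dat$Subject)] +
  word_eff[as.integer(dat$Word)] +
  rnorm(nrow(dat), mean = 0, sd = 5)

# Numeric change score per Subject x Word, computed outside the model.
dat$Diff <- dat$Posttest - dat$Pretest

# A categorical version can still be derived afterwards if it is needed;
# the cut points are arbitrary here.
dat$Change <- cut(dat$Diff, breaks = c(-Inf, -5, 5, Inf),
                  labels = c("decline", "no_gain", "gain"))

# Model the numeric difference with crossed random intercepts (needs lme4).
if (requireNamespace("lme4", quietly = TRUE)) {
  mod <- lme4::lmer(Diff ~ Pretest + Group + (1 | Subject) + (1 | Word),
                    data = dat)
  print(summary(mod))
}
```

The categorical Change column is derived last and plays no role in the fit,
which is the point: the model sees the full numeric difference.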

In that case, I would argue that there can't be a "Time" effect by
subject because you are modelling the difference, and the variance of the
difference already incorporates the variance at each Time. The same holds
for word.
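
To make the variance point concrete: for paired scores,
Var(post - pre) = Var(pre) + Var(post) - 2 Cov(pre, post), so the spread at
both time points is already part of the variance of the difference. A quick
numerical check on made-up numbers (the vectors pre and post are purely
illustrative and not taken from any real data):

```r
set.seed(42)

# Made-up paired scores for 200 observations (illustrative only).
pre  <- rnorm(200, mean = 50, sd = 10)
post <- pre + rnorm(200, mean = 3, sd = 5)

# Variance of the difference, computed directly ...
lhs <- var(post - pre)

# ... and from the variances and covariance of the two time points.
rhs <- var(pre) + var(post) - 2 * cov(pre, post)

# The two quantities agree up to floating-point error.
print(all.equal(lhs, rhs))
```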

Depending on the exact structure of the test and whether there are
multiple pretest scores by subject or by word, you could potentially
include that as a random slope, but to make a more precise
recommendation there, we need to know more about your data.

Best,
Phillip





> 
> Best,
> Souheyla 
> 
> On Wed, 20 Mar 2019 at 17:02, Phillip Alday <phillip.alday using mpi.nl> wrote:
> 
>     Generally speaking for the parameterization of mixed-effects models in
>     lme4/brms/the usual packages, it doesn't make sense to have a varying
>     slope (e.g. Time|Subject) without the corresponding fixed effect. This
>     is because the varying slopes are calculated as offsets from the group
>     mean, i.e. from the fixed effect estimate. Not including the fixed
>     effect is equivalent to assuming the group mean is zero, which is
>     usually not the assumption you want to make.
> 
>     If you fit models with random slopes without the corresponding fixed
>     effects, then there are two main problems:
> 
>     1. The corresponding variance parameter will be mis-estimated because it
>     will be the average squared distance to zero and not the average squared
>     distance to the mean (and average squared distance to the mean is the
>     definition of variance).
> 
>     2. The model may not converge because the numerics are set up under the
>     "zero mean" assumption. For lme4/nlme, this is the case, but I believe
>     that brms may do some internal reparameterization that may avoid these
>     difficulties. (And a model fit with MCMC (brms) may not have the same
>     numerical issues as a model fit with MLE (lme4)).
> 
>     In brief: just add time as a fixed effect.
> 
>     Also: why not fit your model as a continuous model with pre vs. post as
>     a contrast in the model rather than reducing a continuous variable to a
>     category? You can still apply a categorical distinction afterwards if
>     you so desire, but in my experience, it's best to defer making things
>     categorical until as late as possible (see also Frank Harrell's comments
>     on prediction vs. classification:
>     http://www.fharrell.com/post/classification/). Moreover, it's a lot
>     easier to fit a continuous model than a multinomial one ....
> 
>     Best,
>     Phillip
> 
>     On 18/3/19 7:11 pm, Souheyla GHEBGHOUB wrote:
>     > I have *Change* from Pretest to Posttest (gain, no_gain, decline) as the
>     > DV. Also *Pretest* and *Group* as covariates. This called for a multinomial
>     > regression:
>     >
>     > mod0 <- brm(Change ~ Pretest + Group)
>     >
>     > *Question:* I'd like to add random effects of *Subject* and *Word*, which
>     > may differ by time, but I don't have effect of *Time* to do:
>     >
>     > mod1 <- brm(Change ~ Pretest + Group + (Time|Subject) + (Time|Word))
>     >
>     > So I thought of this:
>     >
>     > mod2 <- brm(Change ~ Pretest + Group + (1|Subject) + (1|Word))
>     >
>     > but this also seems wrong to me. What do you think is the best way to treat
>     > random effects in this situation, please?
>     >
>     > Thank you
>     >
>     > Souheyla Ghebghoub
>     >
>     >
>


