[R-sig-ME] Interpreting GLMM output and is this the right model?

Boku mail luca.corlatti using boku.ac.at
Tue Dec 8 14:31:21 CET 2020


Hi Gabriella,
looks like your residuals are not behaving well, so I would not trust your model output.
I have no idea about your data, and there are many people more competent than me on this list who can give you useful advice. At least you know that your starting point needs to be redefined =) It might be that you actually need to change the conditional distribution and pick one that can handle skewed responses in linear mixed models(?).

Try having a look here: https://cran.r-project.org/web/packages/skewlmm/skewlmm.pdf
(You can fit the same model and look again at the residuals)
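
For instance, something like this (just a rough sketch: smsn.lmm and the argument names formFixed / formRandom / groupVar / distr are what I remember from the skewlmm docs, so please double-check them there):

library(skewlmm)
# same fixed and random structure as before, but with a skew-normal conditional distribution
mod.1.sn <- smsn.lmm(data = DF, formFixed = sentiment ~ primetype + age,
                     formRandom = ~1, groupVar = "id", distr = "sn") # "sn" = skew-normal
summary(mod.1.sn)
plot(mod.1.sn) # residual plot (check the package docs for the exact diagnostics offered)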

Alternatively, a similar model could be fitted with the package ‘brms’, in a Bayesian framework:

library(brms)
mod.1.brm <- brm(sentiment ~ primetype + age + (1|id), data=DF, family=skew_normal()) # get a coffee, it might take a while...
predict.mod.1.brm <- t(posterior_predict(mod.1.brm)) # extract posterior predictions to feed to DHARMa
library(DHARMa)
sim.resid.mod.1.brm <- createDHARMa(simulatedResponse = predict.mod.1.brm, observedResponse = DF$sentiment)
plot(sim.resid.mod.1.brm) # other things should be inspected too, but this should allow you to preliminarily check whether the distributional assumption is OK
library(parameters)
parameters(mod.1.brm, ci=0.95, digits=3, ci_digits=3) # posterior summaries with 95% credible intervals
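
And for a quick graphical posterior predictive check (another way to eyeball the same distributional assumption):

pp_check(mod.1.brm) # overlays the observed response density with densities of posterior draws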

L.
On 8 Dec 2020, 13:47 +0100, Gabriella Kountourides <gabriella.kountourides using sjc.ox.ac.uk> wrote:
> Dear Luca,
>
> Many thanks!
>
> That code worked well, output is attached. I'm not sure how to interpret it, but the KS test p = 0 (deviation is sig.), the dispersion test p = 0.992 (deviation is not sig.) and the outlier test p = 0 (deviation is sig.)
>
> Thank you for your well explained answer, it is all much clearer to me, I really appreciate it!
> From: Boku mail <luca.corlatti using boku.ac.at>
> Sent: 07 December 2020 18:07
> To: r-sig-mixed-models <r-sig-mixed-models using r-project.org>; Gabriella Kountourides <gabriella.kountourides using sjc.ox.ac.uk>
> Subject: Re: [R-sig-ME] Interpreting GLMM output and is this the right model?
>
> Dear Gabriella,
> A few thoughts:
>
> 1) generally speaking, it doesn’t make much sense to look at the distribution of the raw response. The choice of the ‘family’ argument is rather based on the conditional distribution, i.e., the distribution of the response around the fitted line (or, in other words, the distribution of the response after accounting for the linear predictor). In this respect, I guess it’s hard to say whether the skewness you mention may be a problem or not (same for the zeros).
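>
> To make this concrete (a minimal sketch; mod.1 is the name I will use for your fitted model in point 2 below):
>
> hist(DF$sentiment)     # raw (marginal) distribution of the response: not what 'family' is about
> hist(residuals(mod.1)) # the conditional distribution is better judged from the residuals of the fitted model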
>
> 2) you fitted a mixed model with a Gaussian conditional distribution. Whether this is a ‘good model’ or not is hard to say (e.g., does your linear predictor include the *supposedly* important explanatory variables, in the correct form? etc.), but at the very least I would inspect the residuals’ behavior. This would allow you to check that the model is not grossly wrong. Suppose your model name is “mod.1”; then you could do:
>
> library(DHARMa)
> sim.mod.1 <- simulateResiduals(mod.1)
> plot(sim.mod.1)
>
> Or, alternatively:
> library(performance)
> check_model(mod.1) # but DHARMa would be handier if you decided to change conditional distribution, and fit a GLMM
>
> If the model is not grossly wrong, the residuals should be distributed in an unsystematic way. If you see weird patterns, then you’re probably off track. If that happens, you should inspect what’s wrong in your model (conditional distribution? missing variables? etc.).
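>
> If you want the formal counterparts of those plots, DHARMa also has dedicated test functions (a small sketch, reusing the sim.mod.1 object from above):
>
> testUniformity(sim.mod.1) # KS test on the scaled residuals
> testDispersion(sim.mod.1) # over-/underdispersion check
> testOutliers(sim.mod.1)   # more simulation outliers than expected?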
>
> 3) once you’re confident your model is ‘well behaved’, you can inspect the results. What your current summary says, basically, is that after accounting for primetype, if you increase age by 1 unit, sentiment decreases by 0.0020644, and this decrease is statistically significant (basically, Estimate / Std. Error = z value; the z-score measures how far [in standard errors] your estimate is from zero; the z-score follows a standard normal distribution, hence a value of z < -1.96 or z > 1.96 will be statistically significant at alpha = 0.05). Whether this is also biologically significant is up to you to say!
> The same reasoning applies to the primetype levels, except that this is a categorical variable, so primetype2 will be 0.0907564 ‘sentiment scores’ higher than the baseline primetype1 (the intercept).
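>
> As a sanity check, you can reproduce those numbers by hand (values taken straight from your summary):
>
> -0.0020644 / 0.0006483  # z value for age, roughly -3.184
> 2 * pnorm(-abs(-3.184)) # two-sided p-value, roughly 0.00145
>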
> Should you be interested in comparing different primetype levels in a pairwise manner:
>
> library(emmeans)
> emmeans(mod.1, ~primetype) # haven’t used this in a while, so cross-check in the package :-)
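> # and, if I remember the interface correctly, the actual pairwise contrasts:
> pairs(emmeans(mod.1, ~primetype)) # Tukey-adjusted comparisons between the primetype levels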
>
> Result interpretation is straightforward in the case of an LMM, which uses an identity link function; should you use a different link, things would be trickier to interpret, and plotting the marginal effects would be wise:
>
> library(visreg)
> visreg(mod.1, "primetype", scale="response") # plot the effect of primetype on the response scale
>
> However, I guess the crucial step for you will be to inspect the behavior of the model in the first place.
>
> Hope this helps!
> Luca
>
>
>
>
> On 7 Dec 2020, 18:10 +0100, Gabriella Kountourides <gabriella.kountourides using sjc.ox.ac.uk> wrote:
> > Hi everyone,
> >
> > I emailed a few weeks ago, but am still struggling with this data.
> > The description of the question below, and model/code/output at the bottom. Many thanks for reading.
> >
> >
> > I want to look at whether there is a relationship between the way a question is asked (positive, negative, neutral wording) and the sentiment of the response. I have 2638 people asked a question about symptoms. 1/3 of the people were asked it with a negative wording, 1/3 with a neutral one, 1/3 with a positive one. From this, I did sentiment analysis (using Trincker's package) to see whether their responses were more positive or negative, depending on the wording of the question.
> > Sentiment analysis breaks responses down into sentences, so I have 2638 people but 7924 sentences, which is why I assume I should fit ID as a random effect.
> >
> > The big question is: does the way the question is asked (primetype) affect the polarity/sentiment of the response?
> > My data is negatively skewed, and has a lot of 0s (this is because some people felt 'neutral' and so they scored '0').
> >
> > Model using the dataframe DF, to see how primetype (the way the question is asked) predicts sentiment (the polarity score, which is negatively skewed with lots of 0s), with age as an additional fixed effect and ID as a random effect:
> >
> > ```
> > library(glmmTMB)
> > glmmTMB(sentiment ~ primetype + age + (1|id), data=DF)
> > ```
> >
> >
> > Output:
> >
> > ```
> > Family: gaussian ( identity )
> > Formula: sentiment ~ primetype + age + (1 | id)
> > Data: DF
> >
> >    AIC      BIC   logLik deviance df.resid
> > 7254.9   7296.5  -3621.4   7242.9     7556
> >
> > Random effects:
> >
> > Conditional model:
> > Groups   Name        Variance  Std.Dev.
> > id       (Intercept) 8.732e-11 9.344e-06
> > Residual             1.526e-01 3.906e-01
> > Number of obs: 7562, groups: id, 2520
> >
> > Dispersion estimate for gaussian family (sigma^2): 0.153
> >
> > Conditional model:
> >               Estimate Std. Error z value Pr(>|z|)
> > (Intercept) -0.1655972  0.0204310  -8.105 5.27e-16 ***
> > primetype2   0.0907564  0.0114045   7.958 1.75e-15 ***
> > primetype3   0.0977533  0.0115802   8.441  < 2e-16 ***
> > age         -0.0020644  0.0006483  -3.184  0.00145 **
> > ---
> > Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> > ```
> > How can I interpret whether the model is a good one for my data, is there something else I should be doing? I'm not sure how to interpret the output at all. Would be immensely grateful for any insight
> >
> >
> > Thanks all
> >
> >
> > Gabriella Kountourides
> >
> > DPhil Student | Department of Anthropology
> >
> > Evolutionary Medicine and Public Health Group
> >
> > St. John’s College, University of Oxford
> >
> > gabriella.kountourides using sjc.ox.ac.uk
> >
> > Tweet me: https://twitter.com/GKountourides
> >
