[R-sig-ME] Variable selection for varying dispersion beta glmm using glmmTMB package

Thu Jun 3 04:09:05 CEST 2021

Look first in the help pages (?DHARMa etc) and vignettes for
the DHARMa package.  After that, I am not sure what to suggest.
Others may have suggestions.

You will be lucky to get a perfect fit.  At the end of the day, the
question is whether such differences as are apparent matter
for the purpose for which you intend to use the model.  A useful
tack is to simulate from the fitted model, refit the model to the
simulated data, and check what difference it makes for the purpose
for which the model is used.  If there is little difference, the
deviations from the model probably do not much matter.  Maybe
repeat this several times.
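
The simulate-and-refit check described above might look roughly like
the sketch below.  Everything in it is an assumption made for
illustration: the data are simulated, the variable names (y, x1,
study_id) only echo the original question, zero inflation is left out,
and "the purpose" is taken to be the fixed-effect estimates of the
conditional model.  Substitute whatever quantity you actually care
about.

```r
library(glmmTMB)

# Hypothetical data: beta response whose precision depends on x1
set.seed(2)
n_study <- 30
n <- n_study * 10
mydata <- data.frame(study_id = factor(rep(seq_len(n_study), each = 10)),
                     x1 = runif(n))
re  <- rnorm(n_study, sd = 0.3)[mydata$study_id]   # study-level intercepts
mu  <- plogis(-0.5 + mydata$x1 + re)               # conditional mean
phi <- exp(1 + 2 * mydata$x1)                      # precision varies with x1
mydata$y <- rbeta(n, mu * phi, (1 - mu) * phi)

fit <- glmmTMB(y ~ x1 + (1 | study_id), dispformula = ~ x1,
               data = mydata, family = beta_family())

# Simulate new responses from the fitted model, then refit to each set
n_rep <- 5
sim_y <- simulate(fit, nsim = n_rep, seed = 123)
coefs <- sapply(seq_len(n_rep), function(i) {
  d <- mydata
  d$y <- sim_y[[i]]
  refit_i <- glmmTMB(y ~ x1 + (1 | study_id), dispformula = ~ x1,
                     data = d, family = beta_family())
  fixef(refit_i)$cond      # the quantity of interest in this sketch
})

# Compare the original estimates with the spread across refits
cbind(fitted = fixef(fit)$cond,
      refit_mean = rowMeans(coefs),
      refit_sd = apply(coefs, 1, sd))
```

If the refitted estimates scatter closely around the original ones,
that is some reassurance for this particular use of the model; a larger
n_rep gives a steadier picture.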

Maybe you need to include degree 2 term(s) in your dispformula.
Try, maybe, a degree 2 natural spline (this may give less wiggle
at the extremes, and more flexibility of shape in the midrange
region) or a degree 2 or even 3 orthogonal polynomial [use poly()].
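
As a rough sketch of what those dispformula alternatives could look
like: the data below are simulated and the variable names are
hypothetical.  Reading "degree 2" for the spline as ns(x1, df = 2)
from the splines package is an assumption, and the AIC comparison at
the end is just one possible way to compare the fits, not something
proposed in this thread.

```r
library(glmmTMB)
library(splines)   # for ns()

# Hypothetical data: precision is a curved function of x1
set.seed(3)
n_study <- 30
n <- n_study * 10
mydata <- data.frame(study_id = factor(rep(seq_len(n_study), each = 10)),
                     x1 = runif(n))
re  <- rnorm(n_study, sd = 0.3)[mydata$study_id]
mu  <- plogis(-0.5 + mydata$x1 + re)
phi <- exp(1 + 4 * mydata$x1 - 4 * mydata$x1^2)
mydata$y <- rbeta(n, mu * phi, (1 - mu) * phi)

# Linear term in the dispersion model
fit_lin <- glmmTMB(y ~ x1 + (1 | study_id), dispformula = ~ x1,
                   data = mydata, family = beta_family())

# Degree 2 orthogonal polynomial in the dispersion model
fit_poly <- glmmTMB(y ~ x1 + (1 | study_id), dispformula = ~ poly(x1, 2),
                    data = mydata, family = beta_family())

# Natural spline with 2 degrees of freedom in the dispersion model
fit_ns <- glmmTMB(y ~ x1 + (1 | study_id), dispformula = ~ ns(x1, df = 2),
                  data = mydata, family = beta_family())

aic_tab <- AIC(fit_lin, fit_poly, fit_ns)
aic_tab
```

Whichever form is chosen, rerunning the DHARMa residual checks on the
refitted model remains the more direct test of whether the dispersion
model is now adequate.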

John Maindonald             email: john.maindonald using anu.edu.au

On 3/06/2021, at 10:33, Tahsin Ferdous <tahsinferdousuofc using gmail.com> wrote:

Hi all,

I am struggling to interpret the residual plots from the DHARMa package. If a red line appears in a residual plot, does it mean there is heteroscedasticity in the model for the predictor variables? If the solid line matches the dashed line, can we say there is no heteroscedasticity?

I have attached the residual plots to help assess heteroscedasticity of the model. In the first plot, quantile deviations are detected (red lines), so there is heteroscedasticity in the model. This plot is for the model that includes all covariates. I then created residual plots for the covariates one at a time, to find out which predictors are responsible for the varying dispersion. The 2nd and 3rd plots are each for a single predictor. In the 2nd plot, the three solid lines are red and deviate clearly from the dashed lines, so there is heteroscedasticity in the model for that predictor. The 3rd plot is a box plot. The residuals within each factor level should be uniformly distributed, so each box should extend from 0.25 to 0.75, with the median line at 0.5. As the two boxes are red and their median lines deviate from 0.5, there is heteroscedasticity in the model for that predictor. The 4th plot shows less deviation. Can we say this is better?

I would appreciate your expert suggestions. Please also refer me to any article that gives a clear explanation of checking heteroscedasticity with DHARMa residual plots. Many thanks.

Kindest regards,

Tahsin

On Tue, Jun 1, 2021 at 4:14 PM Tahsin Ferdous <tahsinferdousuofc using gmail.com> wrote:
Thanks John.

On Tue, Jun 1, 2021 at 3:11 PM John Maindonald <john.maindonald using anu.edu.au> wrote:
No, I was not suggesting that.  I’d stick with the checks done
using simulateResiduals() and plotResiduals() from DHARMa.
The parameter `form` allows you to specify an explanatory
variable against whose values you can plot the simulated
residuals.
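
A minimal sketch of that workflow, assuming simulated data and
hypothetical variable names (x1 continuous, x2 a factor); the model is
deliberately fitted with a constant dispersion so that the plots have
something to detect, and zero inflation is left out for brevity:

```r
library(glmmTMB)
library(DHARMa)

# Hypothetical data: beta response whose precision depends on x1
set.seed(1)
n_study <- 30
n <- n_study * 10
mydata <- data.frame(
  study_id = factor(rep(seq_len(n_study), each = 10)),
  x1 = runif(n),
  x2 = factor(sample(c("a", "b"), n, replace = TRUE))
)
re  <- rnorm(n_study, sd = 0.3)[mydata$study_id]
mu  <- plogis(-0.5 + mydata$x1 + re)
phi <- exp(1 + 2 * mydata$x1)
mydata$y <- rbeta(n, mu * phi, (1 - mu) * phi)

# Model with constant dispersion (no dispformula term for x1)
fit <- glmmTMB(y ~ x1 + x2 + (1 | study_id),
               data = mydata, family = beta_family())

# Simulation-based scaled residuals
res <- simulateResiduals(fit, n = 250)

plot(res)                             # overall QQ and residual plots
plotResiduals(res, form = mydata$x1)  # residuals against a continuous predictor
plotResiduals(res, form = mydata$x2)  # box plots by level of a factor
```

For a continuous predictor the plot shows quantile regression lines;
for a factor it shows one box per level.
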
John Maindonald             email: john.maindonald using anu.edu.a

On 2/06/2021, at 05:07, Tahsin Ferdous <tahsinferdousuofc using gmail.com> wrote:

Hi John,

Thanks for your clarification. Are you suggesting doing the Breusch-Pagan Test without the random effects for glmm?

Best,

Tahsin

On Fri, May 28, 2021 at 4:13 PM John Maindonald <john.maindonald using anu.edu.au> wrote:
The Breusch-Pagan test, as implemented in lmtest, is designed for
lm models with independent normal errors.   You have a random
effects term; surely that invalidates use of this test.  Additionally,
I doubt that a normal distribution is a good enough approximation
to beta that, even without the random effects term, results from
bptest() in lmtest are valid.

John Maindonald             email: john.maindonald using anu.edu.au

On 27/05/2021, at 13:01, Tahsin Ferdous <tahsinferdousuofc using gmail.com> wrote:

I am struggling with varying dispersion beta regression using glmmTMB.
I ran the Breusch-Pagan test to check for heteroscedasticity in my model.
As the p-value is smaller than 0.05, heteroscedasticity is present, so I
have to use a beta glmm with varying dispersion. Further, I need to know
which variables I should include in the dispersion model. To decide this,
I followed a procedure. For example, my response variable is y, the
independent variables are x1, x2 and x3, and there is a random effect for
study id. First, I ran a beta glmm with only y and x1. Then I ran the
Breusch-Pagan test to check for heteroscedasticity. If the p-value is
smaller than 0.05, there is heteroscedasticity; in this case, I added x1
to my dispersion model. Similarly, I ran a beta glmm with y and x2, and
then performed the Breusch-Pagan test. If the result showed
homoscedasticity, I did not include the x2 covariate in the dispersion
model. Again, I did the same thing for y and x3. If the result implied
heteroscedasticity, I added the x3 covariate to my dispersion model.

Finally, the model will look like this:
library(glmmTMB)
m1.f <- glmmTMB(y ~ x1 + x2 + x3 + (1 | study_id), data = mydata,
                ziformula = ~1, dispformula = ~ x1 + x3,
                family = beta_family())
summary(m1.f)

Is my procedure correct?

Should we comment only on the conditional mean model?

Thanks.

_______________________________________________
R-sig-mixed-models using r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

<Rplot1.png><Rplot 2.png><Rplot 3.png><Rplot 4.png>