[R-sig-ME] Negative binomial GLMM model/variables selection based in marginal R2 and conditional R2

Wed Jul 17 17:28:13 CEST 2019

I am currently out of the office until July 5th. I will respond to your email upon my return.

On Jul 17, 2019, at 1:46 AM, Thierry Onkelinx via R-sig-mixed-models <r-sig-mixed-models using r-project.org> wrote:

> Dear Alexandre,
> 
> IMHO the full model of your analysis should be based upon the design of
> your study, not on any goodness-of-fit measurement. Having said that, both
> random-effect variables have only 4 levels. That is too few to get a
> decent variance estimate. I'd recommend treating both as fixed
> effects.
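
A minimal sketch of the suggestion quoted above, with both 4-level grouping variables entered as fixed effects in a negative binomial GLM. The column names come from the original post quoted further down. The data frame is a simulated stand-in, and the reference level "C1-Control" and the age values are assumptions, since neither appears in the thread.

```r
library(MASS)  # glm.nb(): negative binomial GLM without random effects

set.seed(1)
# Hypothetical stand-in for the poster's data frame d3: same column names,
# simulated values (4 treatments x 4 ages x 30 replicates).
d3 <- expand.grid(
  Trat = c("C1-Control", "C1-Insecticide", "C2-Control", "C2-Insecticide"),
  Age_months = c(3, 6, 9, 12),  # assumed ages, not given in the thread
  rep = 1:30
)
d3$Inf_YST <- round(runif(nrow(d3), 0, 1000))
mu <- exp(0.3 + 0.3 * d3$Age_months - 0.4 * (as.integer(d3$Trat) - 1))
d3$Inf_Leaves <- rnbinom(nrow(d3), size = 1, mu = mu)

# Both grouping variables are fixed effects; Age_months is treated as a factor
# so that each of its 4 levels gets its own coefficient.
fit_fixed <- glm.nb(Inf_Leaves ~ Inf_YST + Trat + factor(Age_months), data = d3)
print(summary(fit_fixed))
```

With no random effects left in the model, the marginal versus conditional R2 distinction no longer applies. Whether Age_months should enter as a factor or as a numeric trend depends on the study design.
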
> 
> Best regards,
> 
> ir. Thierry Onkelinx
> Statisticus / Statistician
> 
> Vlaamse Overheid / Government of Flanders
> INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND
> FOREST
> Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
> thierry.onkelinx using inbo.be
> Havenlaan 88 bus 73, 1000 Brussel
> www.inbo.be
> 
> 
> On Thu 11 Jul 2019 at 23:47, ASANTOS <alexandresantosbr using yahoo.com.br>
> wrote:
> 
> Dear R-Mixed-Models Members,
> 
> I would like to choose the better model/variables for my negative binomial
> GLMM based on marginal R2 (variance explained by the fixed factor(s)) and
> conditional R2 (variance explained by both the fixed and random factors),
> but sometimes these values differ greatly: when I gain in conditional R2,
> my marginal R2 is poor, and vice versa (I did a small exercise, swapping
> the positions of the fixed and random effects in the models). In my
> example:
> 
> *A) Model 1 - Inf_Leaves ~ Inf_YST + Age_months + (1 | Trat) - balanced
> values of marginal and conditional R2*
> 
>                 R2m       R2c
> delta     0.4282151 0.5203953
> lognormal 0.5090799 0.6186677
> trigamma  0.3153259 0.3832049
> 
> 
> Generalized linear mixed model fit by maximum likelihood (Laplace
>   Approximation) ['glmerMod']
>  Family: Negative Binomial(0.9207)  ( log )
> Formula: Inf_Leaves ~ Inf_YST + Age_months + (1 | Trat)
>    Data: d3
> 
>      AIC      BIC   logLik deviance df.resid
>   4500.6   4521.9  -2245.3   4490.6      519
> 
> Scaled residuals:
>     Min      1Q  Median      3Q     Max
> -0.9413 -0.7254 -0.4113  0.5294  7.2853
> 
> Random effects:
>  Groups Name        Variance Std.Dev.
>  Trat   (Intercept) 0.2176   0.4664
> Number of obs: 524, groups:  Trat, 4
> 
> Fixed effects:
>               Estimate Std. Error z value Pr(>|z|)
> (Intercept)  0.2847245  0.2913635   0.977    0.328
> Inf_YST     -0.0016482  0.0003483  -4.732 2.22e-06 ***
> Age_months   0.3144764  0.0183616  17.127  < 2e-16 ***
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> 
> Correlation of Fixed Effects:
>            (Intr) In_YST
> Inf_YST     0.171
> Age_months -0.558 -0.532
> convergence code: 0
> Model failed to converge with max|grad| = 0.00631137 (tol = 0.001, component 1)
> Model is nearly unidentifiable: very large eigenvalue
>  - Rescale variables?
> Model is nearly unidentifiable: large eigenvalue ratio
>  - Rescale variables?
> 
> 
> *B) Model 2 - Inf_Leaves ~ Inf_YST + Trat + (1 | Age_months) - a better
> conditional but poor marginal R2*
> 
>                 R2m       R2c
> delta     0.1626844 0.7257397
> lognormal 0.1725712 0.7698453
> trigamma  0.1489258 0.6643626
> 
> 
> Generalized linear mixed model fit by maximum likelihood (Laplace
>   Approximation) ['glmerMod']
>  Family: Negative Binomial(1.8431)  ( log )
> Formula: Inf_Leaves ~ Inf_YST + Trat + (1 | Age_months)
>    Data: d3
> 
>      AIC      BIC   logLik deviance df.resid
>   4121.5   4151.4  -2053.8   4107.5      517
> 
> Scaled residuals:
>     Min      1Q  Median      3Q     Max
> -1.2776 -0.6703 -0.1486  0.3279  5.4019
> 
> Random effects:
>  Groups     Name        Variance Std.Dev.
>  Age_months (Intercept) 1.172    1.083
> Number of obs: 524, groups:  Age_months, 4
> 
> Fixed effects:
>                      Estimate Std. Error z value Pr(>|z|)
> (Intercept)         3.4859551  0.5492043   6.347 2.19e-10 ***
> Inf_YST             0.0005702  0.0002864   1.991   0.0465 *
> TratC1-Insecticide -1.1081610  0.1012478 -10.945  < 2e-16 ***
> TratC2-Control     -0.7859302  0.1058146  -7.427 1.11e-13 ***
> TratC2-Insecticide -1.3833545  0.1041882 -13.277  < 2e-16 ***
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> 
> Correlation of Fixed Effects:
>             (Intr) In_YST TrC1-I TrC2-C
> Inf_YST     -0.122
> TrtC1-Insct -0.103  0.189
> TrtC2-Cntrl -0.104  0.265  0.436
> TrtC2-Insct -0.097  0.221  0.424  0.504
> convergence code: 0
> Model failed to converge with max|grad| = 0.00398879 (tol = 0.001, component 1)
> Model is nearly unidentifiable: very large eigenvalue
>  - Rescale variables?
> Model is nearly unidentifiable: large eigenvalue ratio
>  - Rescale variables?
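
Both fits above end with "Rescale variables?" warnings. One common response, sketched here under assumptions, is to centre and scale the continuous predictors before refitting. The data frame is a simulated stand-in for d3, and the suffixed column names (Inf_YST_z, Age_z) are hypothetical; whether rescaling removes the warnings on the real data cannot be told from the thread.

```r
library(lme4)  # glmer.nb(): negative binomial GLMM

set.seed(2)
# Hypothetical stand-in for d3 (simulated values, same column names).
d3 <- expand.grid(
  Trat = c("C1-Control", "C1-Insecticide", "C2-Control", "C2-Insecticide"),
  Age_months = c(3, 6, 9, 12),  # assumed ages, not given in the thread
  rep = 1:30
)
d3$Inf_YST <- round(runif(nrow(d3), 0, 1000))
trat_shift <- c(0.4, -0.5, 0.1, -0.3)[as.integer(d3$Trat)]
mu <- exp(0.3 + 0.3 * d3$Age_months - 0.001 * d3$Inf_YST + trat_shift)
d3$Inf_Leaves <- rnbinom(nrow(d3), size = 1, mu = mu)

# Centre and scale the continuous predictors (mean 0, SD 1) so that they are
# on comparable scales.
d3$Inf_YST_z <- as.numeric(scale(d3$Inf_YST))
d3$Age_z <- as.numeric(scale(d3$Age_months))

# Refit Model 1 from the post with the rescaled predictors.
fit_scaled <- glmer.nb(Inf_Leaves ~ Inf_YST_z + Age_z + (1 | Trat), data = d3)
print(summary(fit_scaled))
```

After rescaling, each slope is the change in the log mean per standard deviation of the predictor rather than per original unit.
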
> 
> 
> And my questions are:
> 
> 1) Is marginal R2 a good metric for identifying a poor choice of fixed
> effects in my model B, despite its conditional R2 being better than that
> of model A?
> 
> 2) If I'm sure about my fixed and random effects, is a final model with
> high values of both R2 better, or should I choose based on a high
> conditional R2?
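
To make the two quantities in these questions concrete, the sketch below backs out the variance components implied by the "delta" rows quoted above. It assumes the tables come from a function such as MuMIn::r.squaredGLMM and follow the usual definitions for a single random intercept: R2m = var_fixed / total and R2c = (var_fixed + var_random) / total, with total = var_fixed + var_random + var_residual. The helper implied_components is hypothetical, and the results are approximate because the reported variances are rounded.

```r
# Hypothetical helper: recover the variance components implied by a reported
# R2m, R2c and random-intercept variance (single random effect assumed).
implied_components <- function(r2m, r2c, var_random) {
  total <- var_random / (r2c - r2m)  # because R2c - R2m = var_random / total
  var_fixed <- r2m * total           # because R2m = var_fixed / total
  c(fixed = var_fixed,
    random = var_random,
    residual = total - var_fixed - var_random,
    total = total)
}

# Values copied from the "delta" rows and the "Random effects" tables above.
model1 <- implied_components(r2m = 0.4282151, r2c = 0.5203953, var_random = 0.2176)
model2 <- implied_components(r2m = 0.1626844, r2c = 0.7257397, var_random = 1.172)

print(round(rbind(model1, model2), 3))
```

Under these assumptions, the Age_months intercept variance in Model 2 (1.172) makes up more than half of the implied total, which is what raises R2c while R2m stays low. Moving a strong predictor from the fixed part to the random part shifts its variance from one numerator to the other, without this by itself saying which model is better.
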
> 
> 
> Thanks in advance,
> 
> 
> Alexandre
> 
> 
> --
> ======================================================================
> Alexandre dos Santos
> Proteção Florestal
> IFMT - Instituto Federal de Educação, Ciência e Tecnologia de Mato
> Grosso
> Campus Cáceres
> Caixa Postal 244
> Avenida dos Ramires, s/n
> Bairro: Distrito Industrial
> Cáceres - MT                      CEP: 78.200-000
> Fone: (+55) 65 99686-6970 (VIVO) (+55) 65 3221-2674 (FIXO)
> 
>         alexandre.santos using cas.ifmt.edu.br
> Lattes: http://lattes.cnpq.br/1360403201088680
> OrcID: orcid.org/0000-0001-8232-6722
> Researchgate: www.researchgate.net/profile/Alexandre_Santos10
> LinkedIn: br.linkedin.com/in/alexandre-dos-santos-87961635
> Mendeley:www.mendeley.com/profiles/alexandre-dos-santos6/
> ======================================================================
> 
> 
> 
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> 
> 
