[R-sig-ME] [EXT] Re: AW: Re: Too high condition R-square value - beta family
Ben Bolker
bbolker using gmail.com
Tue Nov 29 23:39:00 CET 2022
Now that R manuals have nicer-rendering LaTeX components maybe I'll
bother to write a more complete description under the "Details" section
in ?family.glmmTMB ...
On 2022-11-29 5:36 PM, Daniel Lüdecke wrote:
>> (you're asking the same question here!)
> 😊 My guess is I wanted to dig a bit deeper into the topic to gain a better understanding of that issue, but finding the time to do so is crucial, and then it's forgotten.
>
> -----Original Message-----
> From: Ben Bolker <bbolker using gmail.com>
> Sent: Tuesday, 29 November 2022 23:14
> To: Daniel Lüdecke <d.luedecke using uke.de>; r-sig-mixed-models using r-project.org
> Subject: [EXT] Re: AW: Re: [R-sig-ME] Too high condition R-square value - beta family
>
> This gets tricky (and possibly farther into the weeds than the OP is
> interested in).
>
> tl;dr provided everyone is using the right components of the model
> output in the right places, these two different definitions don't
> necessarily represent a problem.
>
> The $variance component of 'family' objects in R (as produced by
> functions such as gaussian(), Gamma(), etc.) gives only the component of
> the variance that depends on the mean: for example,
> gaussian()$variance() returns a vector of all 1s. (The reason for this
> goes back to the classical definitions of generalized linear models,
> where the dispersion parameter [the scaling factor of the variance that
> is *independent* of the mean] is a nuisance parameter that can be
> ignored for many purposes.) If you want the conditional variance of a
> prediction, you typically need to multiply the $variance() output by a
> dispersion value. You can get this by running sigma() on the model,
> although for glmmTMB families you need to check `?sigma.glmmTMB`: in the
> case of the Beta family I think you need
> $variance(predicted_mu)/(1+sigma(fitted_model)).
>
>
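A minimal base-R sketch of the distinction described above, for readers who want to see the numbers. It assumes the mean-precision Beta parameterization mentioned in this thread; `beta_cond_var()` is a hypothetical helper written for illustration (not a glmmTMB function), `phi = 10` is an arbitrary illustrative value, and `binomial()$variance` is used only as a stand-in because it has the same mu * (1 - mu) form that beta_family()$variance is reported to return.

```r
# $variance() of a family object is only the mean-dependent variance function.
mu <- c(0.2, 0.5, 0.8)

gaussian()$variance(mu)            # a vector of 1s, one per element of mu
v_fun <- binomial()$variance(mu)   # mu * (1 - mu), stand-in for the Beta case

# Hypothetical helper (not part of glmmTMB): conditional variance of a
# Beta response with mean mu and precision phi.
beta_cond_var <- function(mu, phi) mu * (1 - mu) / (1 + phi)

phi <- 10                          # illustrative precision, not from any fit
v_cond <- beta_cond_var(mu, phi)

# The two quantities differ by the constant factor (1 + phi).
v_fun / v_cond

# Simulation check: draw Beta values with shape parameters mu0 * phi and
# (1 - mu0) * phi and compare the sample variance with the formula.
set.seed(1)
mu0 <- 0.6
sim_var <- var(rbeta(1e5, shape1 = mu0 * phi, shape2 = (1 - mu0) * phi))
c(simulated = sim_var, formula = beta_cond_var(mu0, phi))
```

For a fitted glmmTMB model the same idea would use the family's $variance() applied to the predicted means together with sigma() on the fit, as in the expression quoted above; check `?sigma.glmmTMB` before relying on that.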
> More discussion:
>
> * https://github.com/glmmTMB/glmmTMB/issues/294
>
> * https://github.com/glmmTMB/glmmTMB/issues/169#issuecomment-676086686
> (you're asking the same question here!)
>
>
> On 2022-11-29 4:50 PM, Daniel Lüdecke wrote:
>> It may be that the calculation of the random effects variances is not accurate. The code in the *insight* package (which is used by performance::r2()) uses "mu * (1 - mu) / (1 + phi)" to calculate the distributional variance; glmmTMB::beta_family()$variance, however, returns "mu * (1 - mu)". The docs in ?glmmTMB::beta_family, in turn, say: "Beta distribution: parameterization of Ferrari and Cribari-Neto (2004) and the betareg package (Cribari-Neto and Zeileis 2010); V=μ(1−μ)/(ϕ+1)" (which is what is used in *insight*).
>>
>> I'm not sure that this is the issue, but it might be. Would be good to know which of the two formulas is the correct / more accurate one.
>>
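For reference, the two expressions quoted above differ only by the factor 1/(1+ϕ). A short derivation, assuming the mean-precision parameterization cited in the docs, with shape parameters written here as a and b (my notation, not taken from either package):

```latex
\begin{aligned}
Y &\sim \mathrm{Beta}(a, b), \qquad a = \mu\phi, \quad b = (1-\mu)\phi, \\
\mathrm{E}(Y) &= \frac{a}{a+b} = \mu, \\
\operatorname{Var}(Y) &= \frac{ab}{(a+b)^2\,(a+b+1)}
  = \frac{\mu(1-\mu)\,\phi^2}{\phi^2\,(\phi+1)}
  = \frac{\mu(1-\mu)}{1+\phi}.
\end{aligned}
```

So μ(1−μ) is the mean-dependent variance function on its own, and dividing by (1+ϕ) gives the full conditional variance.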
>> -----Original Message-----
>> From: R-sig-mixed-models <r-sig-mixed-models-bounces using r-project.org> On Behalf Of Ben Bolker
>> Sent: Tuesday, 29 November 2022 22:01
>> To: r-sig-mixed-models using r-project.org
>> Subject: [EXT] Re: [R-sig-ME] Too high condition R-square value - beta family
>>
>> Thanks. Can you please post the results of summary() applied to
>> your fitted model? That could give us some more clues ...
>>
>> On 2022-11-29 3:41 PM, camille.montalcini using unibe.ch wrote:
>>> Dear list members,
>>>
>>> I am using glmmTMB to fit a beta family (with log link) to some proportion data (ranging from 0 to 1, which I rescaled to the range 0.01 to 0.99). I have two continuous rescaled predictors (including a time variable) and a binary treatment predictor. My only goal is to assess whether there is any treatment effect (i.e. not to make predictions, so maybe overfitting is less of an issue here). As a random effect I have individual ID (~160 individuals, with around 28 observations per individual). The model fits reasonably well, but the main issue is that I get a very high conditional R-squared: 0.986 (from performance::r2(fit); marginal: 0.034), with the warning: "mu of 0.6 is too close to zero, estimate of random effect variances may be unreliable".
>>>
>>> I tried many things, including checking whether the model is singular with performance::check_singularity() (it appears not to be). Removing the fixed effects does not change anything either; shuffling the individual IDs leads to a conditional R-squared of around 0.25; and removing hens with random intercept modes in the extremes did not change anything either (though the model generally fits better). Visualising the data reveals that the individuals are indeed quite consistent, but likely not to the point that we could explain 98.7% of the variance, so I am quite confident the model is not reliable. It's the first time I am using beta regression and I feel that I am missing an important point here; any insight would be greatly appreciated!
>>>
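The ID-shuffling check mentioned above can be sketched in base R. This is only an illustration on simulated Gaussian data: it swaps the beta mixed model for a one-way ANOVA estimate of the between-individual share of variance, which is a different and much simpler technique than performance::r2() on a glmmTMB fit. The helper `icc_oneway()`, the simulated standard deviations, and the seed are all assumptions made for the example; only the 160 x 28 layout follows the description above.

```r
# Simulated stand-in data: 160 individuals, 28 observations each.
set.seed(1)
n_id  <- 160
n_obs <- 28
id <- factor(rep(seq_len(n_id), each = n_obs))
# Individual-level offsets (sd 1) plus observation-level noise (sd 0.5).
y <- rnorm(n_id, sd = 1)[as.integer(id)] + rnorm(n_id * n_obs, sd = 0.5)

# Hypothetical helper: moment estimate of the between-group variance share
# from a balanced one-way ANOVA.
icc_oneway <- function(y, g) {
  tab <- anova(lm(y ~ g))
  msb <- tab[["Mean Sq"]][1]          # between-group mean square
  msw <- tab[["Mean Sq"]][2]          # within-group mean square
  k   <- length(y) / nlevels(g)       # observations per group (balanced)
  vb  <- max((msb - msw) / k, 0)      # between-group variance, floored at 0
  vb / (vb + msw)
}

icc_orig <- icc_oneway(y, id)          # grouping intact
icc_shuf <- icc_oneway(y, sample(id))  # IDs permuted across observations
c(original = icc_orig, shuffled = icc_shuf)
```

Permuting the IDs breaks the link between observations and individuals while keeping the group sizes, so in this simple setting the between-individual share computed on the shuffled IDs reflects only chance grouping.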
>>> Best,
>>> Camille
>>>
>>>
>>>
>>> _______________________________________________
>>> R-sig-mixed-models using r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>> --
>>
>> _____________________________________________________________________
>>
>> Universitätsklinikum Hamburg-Eppendorf; Körperschaft des öffentlichen Rechts; Gerichtsstand: Hamburg | www.uke.de
>> Vorstandsmitglieder: Prof. Dr. Burkhard Göke (Vorsitzender), Joachim Prölß, Prof. Dr. Blanche Schwappach-Pignataro, Marya Verdel
>> _____________________________________________________________________
>>
>> SAVE PAPER - THINK BEFORE PRINTING