[R-sig-ME] location-scale models in nlme

Mon Jan 15 00:06:15 CET 2024

    I think it's pretty close to a toss-up.

   The main advantage of the specific implementation found in nlme is 
that it allows the *fitted value estimate* to be used as a predictor 
(i.e., specifically implementing a dispersion model based on the (log) 
mean rather than on other covariates.  I would guess that most 
implementations of the log-scale approach do not allow this, i.e. they 
only work with external/previously available covariates.

   If one allowed "fitted value" and "log of fitted value" as 
predictors, then most of the variance models available in lme could also 
be specified in the log-link framework.

   I have been thinking about implementing something like this in 
glmmTMB for a long time but haven't gotten around to it ... 
https://github.com/glmmTMB/glmmTMB/issues/125

On 2024-01-14 4:47 p.m., Simon Harmel wrote:
> When I say "more preferable", I mean for instance, in terms of 
> flexibility, and generality (e.g., approach b subsuming approach a or 
> vice versa).
> 
> On Sun, Jan 14, 2024 at 3:44 PM Simon Harmel <sim.harmel using gmail.com 
> <mailto:sim.harmel using gmail.com>> wrote:
> 
>     Thanks so much, Ben. I conclude that the users of lme() don't need
>     to convert the output for "varIdent()" or "varPower()" back due to a
>     link function to get the estimates of the relevant Level-1 residual
>     variances. This is because the former gives out the proportions
>     between residual variances with respect to a reference level in a
>     categorical variable, and the latter gives out "t", which the user
>     can insert in: (sigma(MODEL)^2)*abs(data$X2_numeric)^(2*t) to obtain
>     the relationship between X2_numeric and the residual variance.
> 
>     Ben, as both a mathematical and applied expert, which location-scale
>     approach, do you think, is more preferable? The one implemented in
>     nlme() or the one that allows modeling the scale using: 
>     log(scale_i) = a_0 + b_1*x_i1+ ... + b_n*x_ip  (for p predictors of
>     scale) ??
> 
>     Thank you so very much for sharing your perspective,
>     Simon
> 
> 
> 
> 
> 
>     On Sun, Jan 14, 2024 at 2:00 PM Ben Bolker <bbolker using gmail.com
>     <mailto:bbolker using gmail.com>> wrote:
> 
>              For varIdent (from ?nlme::varIdent),
> 
>            For identifiability reasons, the
>                coefficients of the variance function represent the
>         ratios between
>                the variances and a reference variance (corresponding to a
>                reference group level).
> 
>             I assume that this is internally parameterized via something
>         like (1)
>         a model matrix constructed with ~ <grouping factor> and (2) a
>         log link,
>         to ensure that the ratios are all positive
> 
>            For varPower,
> 
>         s2(v) = |v|^(2*t)
> 
>            -- notice it uses the absolute value of the covariate. So
>         that term
>         will also be positive.
> 
>         varComb uses a product; the product of two positive values will
>         also be
>         positive ...
> 
>         On 2024-01-14 11:40 a.m., Simon Harmel wrote:
>          > Dear Ben and List Members,
>          >
>          > I'm following up on this
>          >
>         (https://stat.ethz.ch/pipermail/r-sig-mixed-models/2023q4/030552.html <https://stat.ethz.ch/pipermail/r-sig-mixed-models/2023q4/030552.html>
>          >
>         <https://stat.ethz.ch/pipermail/r-sig-mixed-models/2023q4/030552.html <https://stat.ethz.ch/pipermail/r-sig-mixed-models/2023q4/030552.html>>)
>          > thread. There, Ben noted that my MODEL (below) qualifies as a
>          > "location-scale" model.
>          >
>          > Q: Usually for the scale part of a location-scale model, the
>         linear
>          > model uses a log link to guarantee that the estimate of scale
>         is positive:
>          >
>          > log(scale_i) = a_0 + b_1*x_i1+ ... + b_n*x_ip  (for p
>         predictors of scale)
>          >
>          > But in the MODEL that I sketched below, how such a
>         guarantee is made?
>          >
>          > Thanks, Simon
>          > MODEL <- nlme::lme(y ~ X1_categorical + X2_numeric ...,
>          >           random = ~1| subject,
>          >           data = data,
>          >           correlation = corSymm(~1|subject),
>          >           weights = varComb(varIdent(form = ~ 1 | 
>         X1_categorical ),
>          >                                            varPower(form = ~ 
>         X2_numeric )))
>          >
> 
>         -- 
>         Dr. Benjamin Bolker
>         Professor, Mathematics & Statistics and Biology, McMaster University
>         Director, School of Computational Science and Engineering
>         (Acting) Graduate chair, Mathematics & Statistics
>           > E-mail is sent at my convenience; I don't expect replies
>         outside of
>         working hours.
> 

-- 
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
(Acting) Graduate chair, Mathematics & Statistics
 > E-mail is sent at my convenience; I don't expect replies outside of 
working hours.