[R-sig-ME] location-scale models in nlme

Mon Jan 15 00:59:51 CET 2024

    From ?nlme::varExp (I think there are similar explanations in docs 
for other functions):

For the `form` argument

 > The variance covariate must evaluate to a numeric vector and may involve
expressions using ‘"."’, representing a fitted model object
from which fitted values (‘fitted(.)’) and residuals
(‘resid(.)’) can be extracted (this allows the variance
covariate to be updated during the optimization of an object
function).

...

 > Defaults to ‘~ fitted(.)’ representing a variance covariate given by 
the fitted values of a fitted model object and no grouping factor.

   So yes, it's not relevant in your case.

On 2024-01-14 6:56 p.m., Simon Harmel wrote:
> Thanks so much, Ben! I didn't know that nlme uses "fitted value 
> estimates" as predictors!! I know I'm being a pain, but could you please 
> briefly help me contextualize this in the model below?  Or maybe the 
> MODEL below doesn't use "fitted value estimates" as X1_categorical + 
> X2_numeric are both external covariates?
> 
> 
> MODEL <- nlme::lme(y ~ X1_categorical + X2_numeric ...,
>           random = ~1| subject,
>           data = data,
>           correlation = corSymm(~1|subject),
>           weights = varComb(varIdent(form = ~ 1 |  X1_categorical ),
>                                            varPower(form = ~  X2_numeric )))
> 
> On Sun, Jan 14, 2024 at 5:06 PM Ben Bolker <bbolker using gmail.com 
> <mailto:bbolker using gmail.com>> wrote:
> 
>          I think it's pretty close to a toss-up.
> 
>         The main advantage of the specific implementation found in nlme is
>     that it allows the *fitted value estimate* to be used as a predictor
>     (i.e., specifically implementing a dispersion model based on the (log)
>     mean rather than on other covariates.  I would guess that most
>     implementations of the log-scale approach do not allow this, i.e. they
>     only work with external/previously available covariates.
> 
>         If one allowed "fitted value" and "log of fitted value" as
>     predictors, then most of the variance models available in lme could
>     also
>     be specified in the log-link framework.
> 
>         I have been thinking about implementing something like this in
>     glmmTMB for a long time but haven't gotten around to it ...
>     https://github.com/glmmTMB/glmmTMB/issues/125
>     <https://github.com/glmmTMB/glmmTMB/issues/125>
> 
>     On 2024-01-14 4:47 p.m., Simon Harmel wrote:
>      > When I say "more preferable", I mean for instance, in terms of
>      > flexibility, and generality (e.g., approach b subsuming approach
>     a or
>      > vice versa).
>      >
>      > On Sun, Jan 14, 2024 at 3:44 PM Simon Harmel
>     <sim.harmel using gmail.com <mailto:sim.harmel using gmail.com>
>      > <mailto:sim.harmel using gmail.com <mailto:sim.harmel using gmail.com>>> wrote:
>      >
>      >     Thanks so much, Ben. I conclude that the users of lme() don't
>     need
>      >     to convert the output for "varIdent()" or "varPower()" back
>     due to a
>      >     link function to get the estimates of the relevant Level-1
>     residual
>      >     variances. This is because the former gives out the proportions
>      >     between residual variances with respect to a reference level in a
>      >     categorical variable, and the latter gives out "t", which the
>     user
>      >     can insert in: (sigma(MODEL)^2)*abs(data$X2_numeric)^(2*t) to
>     obtain
>      >     the relationship between X2_numeric and the residual variance.
>      >
>      >     Ben, as both a mathematical and applied expert, which
>     location-scale
>      >     approach, do you think, is more preferable? The one
>     implemented in
>      >     nlme() or the one that allows modeling the scale using:
>      >     log(scale_i) = a_0 + b_1*x_i1+ ... + b_n*x_ip  (for p
>     predictors of
>      >     scale) ??
>      >
>      >     Thank you so very much for sharing your perspective,
>      >     Simon
>      >
>      >
>      >
>      >
>      >
>      >     On Sun, Jan 14, 2024 at 2:00 PM Ben Bolker <bbolker using gmail.com
>     <mailto:bbolker using gmail.com>
>      >     <mailto:bbolker using gmail.com <mailto:bbolker using gmail.com>>> wrote:
>      >
>      >              For varIdent (from ?nlme::varIdent),
>      >
>      >            For identifiability reasons, the
>      >                coefficients of the variance function represent the
>      >         ratios between
>      >                the variances and a reference variance
>     (corresponding to a
>      >                reference group level).
>      >
>      >             I assume that this is internally parameterized via
>     something
>      >         like (1)
>      >         a model matrix constructed with ~ <grouping factor> and (2) a
>      >         log link,
>      >         to ensure that the ratios are all positive
>      >
>      >            For varPower,
>      >
>      >         s2(v) = |v|^(2*t)
>      >
>      >            -- notice it uses the absolute value of the covariate. So
>      >         that term
>      >         will also be positive.
>      >
>      >         varComb uses a product; the product of two positive
>     values will
>      >         also be
>      >         positive ...
>      >
>      >         On 2024-01-14 11:40 a.m., Simon Harmel wrote:
>      >          > Dear Ben and List Members,
>      >          >
>      >          > I'm following up on this
>      >          >
>      >       
>       (https://stat.ethz.ch/pipermail/r-sig-mixed-models/2023q4/030552.html <https://stat.ethz.ch/pipermail/r-sig-mixed-models/2023q4/030552.html> <https://stat.ethz.ch/pipermail/r-sig-mixed-models/2023q4/030552.html <https://stat.ethz.ch/pipermail/r-sig-mixed-models/2023q4/030552.html>>
>      >          >
>      >       
>       <https://stat.ethz.ch/pipermail/r-sig-mixed-models/2023q4/030552.html <https://stat.ethz.ch/pipermail/r-sig-mixed-models/2023q4/030552.html> <https://stat.ethz.ch/pipermail/r-sig-mixed-models/2023q4/030552.html <https://stat.ethz.ch/pipermail/r-sig-mixed-models/2023q4/030552.html>>>)
>      >          > thread. There, Ben noted that my MODEL (below)
>     qualifies as a
>      >          > "location-scale" model.
>      >          >
>      >          > Q: Usually for the scale part of a location-scale
>     model, the
>      >         linear
>      >          > model uses a log link to guarantee that the estimate
>     of scale
>      >         is positive:
>      >          >
>      >          > log(scale_i) = a_0 + b_1*x_i1+ ... + b_n*x_ip  (for p
>      >         predictors of scale)
>      >          >
>      >          > But in the MODEL that I sketched below, how such a
>      >         guarantee is made?
>      >          >
>      >          > Thanks, Simon
>      >          > MODEL <- nlme::lme(y ~ X1_categorical + X2_numeric ...,
>      >          >           random = ~1| subject,
>      >          >           data = data,
>      >          >           correlation = corSymm(~1|subject),
>      >          >           weights = varComb(varIdent(form = ~ 1 |
>      >         X1_categorical ),
>      >          >                                           
>     varPower(form = ~
>      >         X2_numeric )))
>      >          >
>      >
>      >         --
>      >         Dr. Benjamin Bolker
>      >         Professor, Mathematics & Statistics and Biology, McMaster
>     University
>      >         Director, School of Computational Science and Engineering
>      >         (Acting) Graduate chair, Mathematics & Statistics
>      >           > E-mail is sent at my convenience; I don't expect replies
>      >         outside of
>      >         working hours.
>      >
> 
>     -- 
>     Dr. Benjamin Bolker
>     Professor, Mathematics & Statistics and Biology, McMaster University
>     Director, School of Computational Science and Engineering
>     (Acting) Graduate chair, Mathematics & Statistics
>       > E-mail is sent at my convenience; I don't expect replies outside of
>     working hours.
> 

-- 
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
(Acting) Graduate chair, Mathematics & Statistics
 > E-mail is sent at my convenience; I don't expect replies outside of 
working hours.