[R-sig-ME] location-scale models in nlme

Mon Jan 15 01:25:41 CET 2024

Thank you! That's correct, for three of the variance functions (varExp,
varPower, varConstPower), "form=" defaults to "fitted(.)", which I suppose
acknowledges the power relation between residual variance and the fitted
values (I think this, to some extent, accounts for endogeneity, the
relation between explained and unexplained variance in a DV).

What surprised me in that explanation was that the user can also use "form
= ~ resid(.)", how is this even possible?

On Sun, Jan 14, 2024 at 6:00 PM Ben Bolker <bbolker using gmail.com> wrote:

>     From ?nlme::varExp (I think there are similar explanations in docs
> for other functions):
>
> For the `form` argument
>
>  > The variance covariate must evaluate to a numeric vector and may involve
> expressions using ‘"."’, representing a fitted model object
> from which fitted values (‘fitted(.)’) and residuals
> (‘resid(.)’) can be extracted (this allows the variance
> covariate to be updated during the optimization of an object
> function).
>
> ...
>
>  > Defaults to ‘~ fitted(.)’ representing a variance covariate given by
> the fitted values of a fitted model object and no grouping factor.
>
>    So yes, it's not relevant in your case.
>
>
> On 2024-01-14 6:56 p.m., Simon Harmel wrote:
> > Thanks so much, Ben! I didn't know that nlme uses "fitted value
> > estimates" as predictors!! I know I'm being a pain, but could you please
> > briefly help me contextualize this in the model below?  Or maybe the
> > MODEL below doesn't use "fitted value estimates" as X1_categorical +
> > X2_numeric are both external covariates?
> >
> >
> > MODEL <- nlme::lme(y ~ X1_categorical + X2_numeric ...,
> >           random = ~1| subject,
> >           data = data,
> >           correlation = corSymm(~1|subject),
> >           weights = varComb(varIdent(form = ~ 1 |  X1_categorical ),
> >                                            varPower(form = ~  X2_numeric
> )))
> >
> > On Sun, Jan 14, 2024 at 5:06 PM Ben Bolker <bbolker using gmail.com
> > <mailto:bbolker using gmail.com>> wrote:
> >
> >          I think it's pretty close to a toss-up.
> >
> >         The main advantage of the specific implementation found in nlme
> is
> >     that it allows the *fitted value estimate* to be used as a predictor
> >     (i.e., specifically implementing a dispersion model based on the
> (log)
> >     mean rather than on other covariates.  I would guess that most
> >     implementations of the log-scale approach do not allow this, i.e.
> they
> >     only work with external/previously available covariates.
> >
> >         If one allowed "fitted value" and "log of fitted value" as
> >     predictors, then most of the variance models available in lme could
> >     also
> >     be specified in the log-link framework.
> >
> >         I have been thinking about implementing something like this in
> >     glmmTMB for a long time but haven't gotten around to it ...
> >     https://github.com/glmmTMB/glmmTMB/issues/125
> >     <https://github.com/glmmTMB/glmmTMB/issues/125>
> >
> >     On 2024-01-14 4:47 p.m., Simon Harmel wrote:
> >      > When I say "more preferable", I mean for instance, in terms of
> >      > flexibility, and generality (e.g., approach b subsuming approach
> >     a or
> >      > vice versa).
> >      >
> >      > On Sun, Jan 14, 2024 at 3:44 PM Simon Harmel
> >     <sim.harmel using gmail.com <mailto:sim.harmel using gmail.com>
> >      > <mailto:sim.harmel using gmail.com <mailto:sim.harmel using gmail.com>>>
> wrote:
> >      >
> >      >     Thanks so much, Ben. I conclude that the users of lme() don't
> >     need
> >      >     to convert the output for "varIdent()" or "varPower()" back
> >     due to a
> >      >     link function to get the estimates of the relevant Level-1
> >     residual
> >      >     variances. This is because the former gives out the
> proportions
> >      >     between residual variances with respect to a reference level
> in a
> >      >     categorical variable, and the latter gives out "t", which the
> >     user
> >      >     can insert in: (sigma(MODEL)^2)*abs(data$X2_numeric)^(2*t) to
> >     obtain
> >      >     the relationship between X2_numeric and the residual variance.
> >      >
> >      >     Ben, as both a mathematical and applied expert, which
> >     location-scale
> >      >     approach, do you think, is more preferable? The one
> >     implemented in
> >      >     nlme() or the one that allows modeling the scale using:
> >      >     log(scale_i) = a_0 + b_1*x_i1+ ... + b_n*x_ip  (for p
> >     predictors of
> >      >     scale) ??
> >      >
> >      >     Thank you so very much for sharing your perspective,
> >      >     Simon
> >      >
> >      >
> >      >
> >      >
> >      >
> >      >     On Sun, Jan 14, 2024 at 2:00 PM Ben Bolker <bbolker using gmail.com
> >     <mailto:bbolker using gmail.com>
> >      >     <mailto:bbolker using gmail.com <mailto:bbolker using gmail.com>>> wrote:
> >      >
> >      >              For varIdent (from ?nlme::varIdent),
> >      >
> >      >            For identifiability reasons, the
> >      >                coefficients of the variance function represent the
> >      >         ratios between
> >      >                the variances and a reference variance
> >     (corresponding to a
> >      >                reference group level).
> >      >
> >      >             I assume that this is internally parameterized via
> >     something
> >      >         like (1)
> >      >         a model matrix constructed with ~ <grouping factor> and
> (2) a
> >      >         log link,
> >      >         to ensure that the ratios are all positive
> >      >
> >      >            For varPower,
> >      >
> >      >         s2(v) = |v|^(2*t)
> >      >
> >      >            -- notice it uses the absolute value of the covariate.
> So
> >      >         that term
> >      >         will also be positive.
> >      >
> >      >         varComb uses a product; the product of two positive
> >     values will
> >      >         also be
> >      >         positive ...
> >      >
> >      >         On 2024-01-14 11:40 a.m., Simon Harmel wrote:
> >      >          > Dear Ben and List Members,
> >      >          >
> >      >          > I'm following up on this
> >      >          >
> >      >
> >       (
> https://stat.ethz.ch/pipermail/r-sig-mixed-models/2023q4/030552.html <
> https://stat.ethz.ch/pipermail/r-sig-mixed-models/2023q4/030552.html> <
> https://stat.ethz.ch/pipermail/r-sig-mixed-models/2023q4/030552.html <
> https://stat.ethz.ch/pipermail/r-sig-mixed-models/2023q4/030552.html>>
> >      >          >
> >      >
> >       <
> https://stat.ethz.ch/pipermail/r-sig-mixed-models/2023q4/030552.html <
> https://stat.ethz.ch/pipermail/r-sig-mixed-models/2023q4/030552.html> <
> https://stat.ethz.ch/pipermail/r-sig-mixed-models/2023q4/030552.html <
> https://stat.ethz.ch/pipermail/r-sig-mixed-models/2023q4/030552.html>>>)
> >      >          > thread. There, Ben noted that my MODEL (below)
> >     qualifies as a
> >      >          > "location-scale" model.
> >      >          >
> >      >          > Q: Usually for the scale part of a location-scale
> >     model, the
> >      >         linear
> >      >          > model uses a log link to guarantee that the estimate
> >     of scale
> >      >         is positive:
> >      >          >
> >      >          > log(scale_i) = a_0 + b_1*x_i1+ ... + b_n*x_ip  (for p
> >      >         predictors of scale)
> >      >          >
> >      >          > But in the MODEL that I sketched below, how such a
> >      >         guarantee is made?
> >      >          >
> >      >          > Thanks, Simon
> >      >          > MODEL <- nlme::lme(y ~ X1_categorical + X2_numeric ...,
> >      >          >           random = ~1| subject,
> >      >          >           data = data,
> >      >          >           correlation = corSymm(~1|subject),
> >      >          >           weights = varComb(varIdent(form = ~ 1 |
> >      >         X1_categorical ),
> >      >          >
> >     varPower(form = ~
> >      >         X2_numeric )))
> >      >          >
> >      >
> >      >         --
> >      >         Dr. Benjamin Bolker
> >      >         Professor, Mathematics & Statistics and Biology, McMaster
> >     University
> >      >         Director, School of Computational Science and Engineering
> >      >         (Acting) Graduate chair, Mathematics & Statistics
> >      >           > E-mail is sent at my convenience; I don't expect
> replies
> >      >         outside of
> >      >         working hours.
> >      >
> >
> >     --
> >     Dr. Benjamin Bolker
> >     Professor, Mathematics & Statistics and Biology, McMaster University
> >     Director, School of Computational Science and Engineering
> >     (Acting) Graduate chair, Mathematics & Statistics
> >       > E-mail is sent at my convenience; I don't expect replies outside
> of
> >     working hours.
> >
>
> --
> Dr. Benjamin Bolker
> Professor, Mathematics & Statistics and Biology, McMaster University
> Director, School of Computational Science and Engineering
> (Acting) Graduate chair, Mathematics & Statistics
>  > E-mail is sent at my convenience; I don't expect replies outside of
> working hours.
>

	[[alternative HTML version deleted]]