[R-sig-ME] reference category in varIdent function (once again)

James Pustejovsky jepu@to @end|ng |rom gm@||@com
Sun Dec 29 22:40:05 CET 2024


Hi Sebastian,

Thanks for your reply. This helped me figure out how to control the
reference level for the particular model I'm working on. I wrote up what
I've learned (in an excessive degree of detail) here:
https://jepusto.com/posts/varIdent-function-in-nlme/

James

On Mon, Dec 23, 2024 at 10:40 AM Sebastian Meyer <seb.meyer using fau.de> wrote:

> I think it is unfortunate that varIdent() selects the reference category
> based on how the data is sorted (using the first level that occurs).
> There is also a corresponding bug report
> <https://bugs.r-project.org/show_bug.cgi?id=18505>.
>
> Your example illustrates an additional issue with this behaviour,
> because lme() internally reorders the data by the grouping factor(s)
> before initialization, so the order of that factor ('Subject' in your
> example, with first level "M16", so male) will indirectly determine the
> reference level of varIdent(), regardless of how your data or strata
> levels are ordered originally.
>
> It seems that only if there are more than two strata, the reference
> level for varIdent() can be chosen via a named initial parameter
> 'value', leaving out the desired reference level, but I haven't tested
> if this works as intended with both gls() and lme(). It would then make
> sense to support such a choice also in the case of only two strata, so
> for a single parameter.
>
> Best wishes,
>
>         Sebastian Meyer
>
>
> Am 20.12.24 um 20:39 schrieb James Pustejovsky:
> > Greetings,
> >
> > I'm trying to fit a linear mixed effect model using nlme::lme() that
> > includes heteroskedastic variances by group, where the variance for one
> > group is fixed to a specific value. I *think* I can do this using
> > varIdent() and setting lmeControl(sigma = x) for a specified value x.
> > However, I've run into what is evidently an old problem: that there
> doesn't
> > appear to be any way to control the reference level of varIdent().
> >
> > As was noted in a very old post to R-SIG-mixed-models (
> > https://stat.ethz.ch/pipermail/r-sig-mixed-models/2013q3/020582.html),
> the
> > reference level of varIdent() within an lme() call is set to the most
> > common level of the grouping factor. For reasons that are too involved to
> > get into, I need to set the reference level to something else (in fact,
> the
> > least frequently occurring level), so it seems I'm stuck.
> >
> > Curiously, the behavior of varIdent() within a gls() call is a bit
> > different. As documented in this StackOverflow (
> >
> https://stackoverflow.com/questions/75929847/reference-category-in-varident-function-nlme-package-depends-on-data-order
> ),
> > the reference level used within gls() depends on the sort order of the
> > data. I've included below a minimal working example that demonstrates the
> > difference in behavior.
> >
> > Does anyone know of any work-around for setting the reference level of
> > varIdent() within an lme() call? I'd be grateful for any
> leads/suggestions
> > on what I could do (up to and including to just use a different package
> > already).
> >
> > Thanks,
> > James
> >
> >
> >
> ----------------------------------------------------------------------------
> > library(nlme)
> >
> > # Make reversed factor
> > table(Orthodont$Sex)
> > Orthodont$Sex_rev <- Orthodont$Sex
> > levels(Orthodont$Sex_rev) <- rev(levels(Orthodont$Sex))
> > table(Orthodont$Sex_rev)
> >
> > Ortho_male_first <- Orthodont[order(Orthodont$Sex),]
> > Ortho_female_first <- Orthodont[order(Orthodont$Sex, decreasing = TRUE),]
> >
> > # Fit gls models with heteroskedastic variance
> > gls1 <- gls(
> >    distance ~ age + Sex,
> >    weights = varIdent(form = ~ 1 | Sex),
> >    data = Ortho_male_first
> > )
> > gls2 <- update(gls1, data = Ortho_female_first)
> > gls3 <- update(gls1, weights = varIdent(form = ~ 1 |Sex_rev))
> > gls4 <- update(gls3, data = Ortho_female_first)
> >
> > # Parameterization depends on sort order
> > coef(gls1$modelStruct$varStruct)
> > coef(gls2$modelStruct$varStruct)
> > # But not on factor level order
> > coef(gls3$modelStruct$varStruct)
> > coef(gls4$modelStruct$varStruct)
> >
> > # Fit lme models with heteroskedastic level-1 variance
> >
> > lme1 <- lme(
> >    distance ~ age + Sex,
> >    random = ~ 1,
> >    weights = varIdent(form = ~ 1 | Sex),
> >    data = Ortho_male_first
> > )
> > lme2 <- update(lme1, data = Ortho_female_first)
> > lme3 <- update(lme1, weights = varIdent(form = ~ 1 |Sex_rev))
> > lme4 <- update(lme3, data = Ortho_female_first)
> >
> > # Parameterization depends neither on sort order nor on factor level
> order
> > coef(lme1$modelStruct$varStruct)
> > coef(lme2$modelStruct$varStruct)
> > coef(lme3$modelStruct$varStruct)
> > coef(lme4$modelStruct$varStruct)
> >
> >       [[alternative HTML version deleted]]
> >
> > _______________________________________________
> > R-sig-mixed-models using r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
>

	[[alternative HTML version deleted]]



More information about the R-sig-mixed-models mailing list