[R-sig-ME] results lme unstructured covariance matrix, again

ben pelzer benpelzer at gmail.com
Thu Jul 21 11:24:27 CEST 2022


Dear John,

Thanks again for your reply and good advice. Indeed, the choice of model
should be guided by theory or, say, practical knowledge of the data at
hand. And indeed, the "random time" approach could definitely be a good way
to think of the underlying mechanism that generated the data, apart from
whether or not lme or lmer is able to produce meaningful estimates. That
would only be a technical issue. Thanks for making this clearer!

Ben.


On Tue, 19 Jul 2022 at 19:23, John Fox <jfox using mcmaster.ca> wrote:

> Dear Ben,
>
> The discussion is starting to stray from software issues, so I'll try to
> answer briefly.
>
> I think that you're emphasizing the wrong point, which is how to specify
> a model that produces estimates, rather than specifying a random-effects
> structure that's reasonable for the data.
>
> As you point out, an unrestricted variance-covariance structure for the
> model you originally specified is underidentified, as is a similar model
> with 3 within-subjects measurements. As you also correctly point out,
> fitting 3 occasions with 2 dummy regressors is equivalent to fitting a
> quadratic -- both have 3 parameters (including the intercept).
>
> Placing a restriction on the random effects, such as setting the
> observation-level error variance to 0, can identify the
> variance-covariance parameters, as can other sorts of restrictions. If
> just one constraint is placed on the variances and covariances of the
> random effects, the resulting variance-covariance parameters will be
> just-identified, and different such constraints are equivalent, in the
> sense of producing the same likelihood and fixed-effects estimates. You
> can see how this works by looking at the equations I wrote out for the
> variance-covariance components.
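>
> In nlme, I believe such a constraint can be imposed directly by fixing the
> residual standard error via lmeControl(sigma=); a minimal (untested) sketch,
> using the da data from my earlier message:
>
> m2f <- lme(y ~ 1 + t2, random = ~ 1 + t2 | person, data = da,
>            control = lmeControl(sigma = 1e-6))  # residual SD fixed near zero
> getVarCov(m2f)  # the person-level (co)variances then absorb essentially all the variation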
>
> This is simply a mechanical point, and in general there's no reason to
> prefer one constraint to another, but in a real application one model
> may be more reasonable than another for the data at hand. For example,
> the data I generated has only one true random effect -- the observation
> level error -- and so a correct model for the data has only one
> variance-covariance component -- the error variance. lm(y ~ t2, data=da)
> fits this model, which is much more restrictive than a single constraint
> on the more general mixed model. More elaborate random-effect
> specifications for my data have superfluous parameters, whether or not
> they're identified.
>
> Best,
>   John
>
> On 2022-07-19 8:50 a.m., ben pelzer wrote:
> > Dear John, Wolfgang, James and Ben,
> >
> > Thanks to you all for your explanations and examples! As I understand it
> > now, suppressing the estimation of sigma, by fixing it to a small
> > positive number close to zero, produces random-effects estimates of the
> > intercept and t2 (or of t1 and t2) that make sense. However, the
> > log-likelihood doesn't make sense. I guess that with sigma fixed to nearly
> > zero, the predictions will be meaningful too, even when including the
> > random effects in the predictions, though I did not verify that yet.
> >
> > All in all, I think that for a model with an unstructured covariance
> > matrix, "gls" is more straightforward. For the example of John:
> >
> > da$time <- da$t2 + 1
> > m4 <- gls(y ~ 1 + t2,
> >            correlation=corSymm(form= ~time|person),
> >            weights=varIdent(form= ~1|time),
> >            data=da)
> >
> > and the results are equal to e.g.
> >
> > m2 <- lme(y ~ 1 + t2, random = ~ 1 + t2 | person, data=da)
> >
> > as far as the log-likelihood and the estimated (co)variances (getVarCov) are
> > concerned.
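> >
> > For instance, a quick check along these lines (assuming both fits above):
> >
> > logLik(m4); logLik(m2)                                # same restricted log-likelihood
> > getVarCov(m4)                                         # marginal 2 x 2 matrix from gls
> > getVarCov(m2, type = "marginal", individuals = "1")   # same matrix from lme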
> >
> > My question about "lme" arose when helping someone with such an
> > unstructured-covariance model, for data with three fixed occasions. With
> > "many" occasions, a random linear effect of time is often employed, or
> > maybe a random quadratic or cubic time effect... Anyway, time is used as a
> > random (linear etc.) effect in such models. But with only three occasions,
> > and with the time effect being quadratic rather than linear, the model turns
> > into the kind of model I was asking about, only with three occasions
> > instead of two. So, instead of
> >
> > modelA <- lme(y ~ 1+time+timesquared, random= ~ 1+time+timesquared |
> > person, data=da)
> >
> > one could then use
> >
> > modelB <- lme(y ~ 1+t2+t3, random= ~ 1+t2+t3 | person, data=da)
> >
> > But the random-effect variances of t2 and t3 are not meaningful, as I know
> > now. And with the often-used "lmer" the model is not estimated (except when
> > avoiding the "counting check", as Ben Bolker explained). This means that the
> > "random time effect" models, which I explain to students in a course, are
> > not applicable to such data. I thought/hoped that "lme" could be used, and
> > in a way it can, by fixing the sigma, but apparently this also leads to
> > some "unexpected" results. On the other hand, "gls" is a new method for
> > students used to the "random time effect" models. So I have to choose
> > between these two possibilities.
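> >
> > For the three-occasion case, the gls analogue would, I think, look something
> > like this (only a sketch; da3, with dummies t2 and t3 and an integer
> > occasion index occ running 1 to 3, is a hypothetical data frame):
> >
> > m5 <- gls(y ~ 1 + t2 + t3,
> >           correlation = corSymm(form = ~ occ | person),
> >           weights = varIdent(form = ~ 1 | occ),
> >           data = da3)
> > getVarCov(m5)  # unstructured 3 x 3 covariance matrix of the three occasions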
> >
> > Thanks again everyone, time to switch off the computer and enjoy the sun.
> > It's almost 40 degrees Celsius or 104 Fahrenheit in The Netherlands now, so
> > maybe enjoying is not the right word. Best to you all,
> >
> > Ben Pelzer.
> >
> >
> >
> >
> >
> > On Tue, 19 Jul 2022 at 04:15, John Fox <jfox using mcmaster.ca> wrote:
> >
> >> Dear Ben,
> >>
> >> On 2022-07-18 1:32 p.m., John Fox wrote:
> >>> Dear Ben,
> >>>
> >>> My apologies---I paid attention to the inessential part of your message,
> >>> which was the parametrization of the model rather than the
> >>> variance-covariance structure.
> >>>
> >>> This is a little awkward because of the LaTeX (and as far as I know, I
> >>> can't attach a PDF, but maybe you can paste this into a LaTeX editor):
> >>
> >> Rolf Turner pointed out to me that PDF attachments *are* acceptable, and
> >> so I've attached a PDF with the LaTeX part of my message.
> >>
> >> Best,
> >> John
> >>
> >>>
> >>> The model is $Y_{ij} = \alpha + \beta x_{ij} + \delta_{1i} + \delta_{2i}
> >>> x_{ij} + \varepsilon_{ij}$ for $i = 1, \ldots, n$ and $j = 1, 2$, where
> >>> $x_{i1} = 1$ and $x_{i2} = 0$.
> >>>
> >>> The variances of the random effects are $V(\delta_{1i}) = \psi_1^2$ and
> >>> $V(\delta_{2i}) = \psi_2^2$, and their covariance is $C(\delta_{1i},
> >>> \delta_{2i}) = \psi_{12}$. The observation-level error variance is
> >>> $V(\varepsilon_{ij}) = \sigma^2$.
> >>>
> >>> Then the composite error is $\zeta_{ij} = \delta_{1i} + \delta_{2i}
> >>> x_{ij} + \varepsilon_{ij}$ with variance $V(\zeta_{ij}) = \psi_1^2 +
> >>> x^2_{ij} \psi_2^2 + 2 x_{ij} \psi_{12} + \sigma^2$, which is
> >>> $V(\zeta_{i1}) = \psi_1^2 + \psi_2^2 + 2\psi_{12} + \sigma^2$ for $j = 1$
> >>> and $V(\zeta_{i2}) = \psi_1^2 + \sigma^2$ for $j = 2$, and covariance
> >>> $C(\zeta_{i1}, \zeta_{i2}) = \psi_1^2 + \psi_{12}$.
> >>>
> >>> There are, as you say, 4 variance-covariance components, $\psi_1^2,
> >>> \psi_2^2, \psi_{12}, \sigma^2$, and just 3 variances and covariances
> >>> among the composite disturbances, $V(\zeta_{i1}), V(\zeta_{i2}),
> >>> C(\zeta_{i1}, \zeta_{i2})$, and so---as you and several others have
> >>> noted (but I missed)---the variance-covariance components are
> >>> underidentified.
> >>>
> >>> lmer(), but not lme(), appears to use a heuristic, counting the number
> >>> of random effects and observations, to detect underidentification. I
> >>> don't know whether that's bullet-proof, but it works in this case. Maybe
> >>> Ben Bolker, who has already responded, can comment further.
> >>>
> >>> Best,
> >>>    John
> >>>
> >>> John Fox, Professor Emeritus
> >>> McMaster University
> >>> Hamilton, Ontario, Canada
> >>> web: https://socialsciences.mcmaster.ca/jfox/
> >>>
> >>> On 2022-07-18 6:39 a.m., ben pelzer wrote:
> >>>> Dear John,
> >>>>
> >>>> Thank you for answering my question and your nice example with the
> >>>> alternative model formulations! But I'm still in doubt about the
> >>>> results. The fact that there are only three observed (co)variances
> >>>> which are being reproduced by four parameter estimates of lme leaves
> >>>> me confused. Actually, I would think that there is no unique solution
> >>>> without some constraint being applied. But I could not find something
> >>>> about such a constraint in lme documentation. The random effects of t1
> >>>> and t2, or of the intercept and t2 in your alternative model, should
> >>>> be sufficient to reproduce the observed (co)variances, so the residual
> >>>> variance is "unnecassary". I guess that is the reason that lmer is not
> >>>> able to estimate the model.
> >>>>
> >>>> library(lme4)
> >>>> m3 <- lmer(y ~ 1+t2+(1+t2|person), data=da)
> >>>>
> >>>> Error: number of observations (=20) <= number of random effects (=20)
> >>>> for term (1 + t2 | person); the random-effects parameters and the
> >>>> residual variance (or scale parameter) are probably unidentifiable
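> >>>>
> >>>> I believe that check can be bypassed, if one really wants the fit, with
> >>>> something like the following (a sketch; the estimates would of course
> >>>> still be unidentified):
> >>>>
> >>>> m3 <- lmer(y ~ 1 + t2 + (1 + t2 | person), data = da,
> >>>>            control = lmerControl(check.nobs.vs.nRE = "ignore"))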
> >>>>
> >>>>
> >>>> It's interesting to notice that the two model specifications (t1+t2
> >>>> and 1+t2) lead to different residual variances estimated by lme. What
> >>>> do these residual variances mean? And of course also: what do the
> >>>> random effect variances estimated by both model formulations actually
> >>>> mean or stand for? Do you have any ideas? Kind regards,
> >>>>
> >>>> Ben.
> >>>>
> >>>> On Sat, 16 Jul 2022 at 16:50, John Fox <jfox using mcmaster.ca> wrote:
> >>>>
> >>>>      Dear Ben,
> >>>>
> >>>>      First, I'll make this into a reproducible example:
> >>>>
> >>>>        > set.seed(123)
> >>>>        > t1 <- c(rep(1, 10), rep(0, 10))
> >>>>        > t2 <- 1 - t1
> >>>>        > person <- rep(1:10, 2)
> >>>>        > y <- t2 + rnorm(20)
> >>>>        > da <- data.frame(y, t1, t2, person)
> >>>>
> >>>>        > library(nlme)
> >>>>
> >>>>      Then note that the random-effect specification 0 + t1 + t2 is simply a
> >>>>      reparametrization of 1 + t2 (i.e., 1 = t1 + t2), which produces the same
> >>>>      fit to the data (same fixed effects, same restricted log-likelihood):
> >>>>
> >>>>        > m1 <- lme(y ~ 1 + t2, random = ~ 0 + t1 + t2 | person, data=da)
> >>>>        > m2 <- lme(y ~ 1 + t2, random = ~ 1 + t2 | person, data=da)
> >>>>        > m1
> >>>>      Linear mixed-effects model fit by REML
> >>>>          Data: da
> >>>>          Log-restricted-likelihood: -25.92726
> >>>>          Fixed: y ~ 1 + t2
> >>>>      (Intercept)          t2
> >>>>         0.07462564  1.13399632
> >>>>
> >>>>      Random effects:
> >>>>         Formula: ~0 + t1 + t2 | person
> >>>>         Structure: General positive-definite, Log-Cholesky parametrization
> >>>>                 StdDev    Corr
> >>>>      t1       0.8964136 t1
> >>>>      t2       0.9856215 0.647
> >>>>      Residual 0.3258015
> >>>>
> >>>>      Number of Observations: 20
> >>>>      Number of Groups: 10
> >>>>
> >>>>        > m2
> >>>>      Linear mixed-effects model fit by REML
> >>>>          Data: da
> >>>>          Log-restricted-likelihood: -25.92726
> >>>>          Fixed: y ~ 1 + t2
> >>>>      (Intercept)          t2
> >>>>         0.07462564  1.13399632
> >>>>
> >>>>      Random effects:
> >>>>         Formula: ~1 + t2 | person
> >>>>         Structure: General positive-definite, Log-Cholesky parametrization
> >>>>                    StdDev    Corr
> >>>>      (Intercept) 0.8787887 (Intr)
> >>>>      t2          0.7540826 -0.302
> >>>>      Residual    0.3707215
> >>>>
> >>>>      Number of Observations: 20
> >>>>      Number of Groups: 10
> >>>>
> >>>>      Finally, it's unnecessary to supply the intercept 1 in the model
> >>>>      formula
> >>>>      since the intercept is implied if it's not explicitly excluded:
> >>>>
> >>>>        > m3 <- lme(y ~ t2, random = ~ t2 | person, data=da)
> >>>>        > m3
> >>>>      Linear mixed-effects model fit by REML
> >>>>          Data: da
> >>>>          Log-restricted-likelihood: -25.92726
> >>>>          Fixed: y ~ t2
> >>>>      (Intercept)          t2
> >>>>         0.07462564  1.13399632
> >>>>
> >>>>      Random effects:
> >>>>         Formula: ~t2 | person
> >>>>         Structure: General positive-definite, Log-Cholesky parametrization
> >>>>                    StdDev    Corr
> >>>>      (Intercept) 0.8787887 (Intr)
> >>>>      t2          0.7540826 -0.302
> >>>>      Residual    0.3707215
> >>>>
> >>>>      Number of Observations: 20
> >>>>      Number of Groups: 10
> >>>>
> >>>>      I hope this helps,
> >>>>         John
> >>>>
> >>>>      On 2022-07-16 10:05 a.m., ben pelzer wrote:
> >>>>       > Sorry, my previous mail contained another question which is
> >>>>       > irrelevant... I deleted that now.
> >>>>       >
> >>>>       >
> >>>>       > Hi all,
> >>>>       >
> >>>>       > I have a question about results from lme of package nlme.
> >>>>       >
> >>>>       > Suppose the data consists of repeated measures at two fixed time
> >>>>       > points.
> >>>>       >
> >>>>       > I used the following equation:
> >>>>       >
> >>>>       >
> >>>>       >
> >>>>       > Model1 <- lme(y ~ 1 + t2, random = ~ 0 + t1 + t2 | person, data=da)
> >>>>       >
> >>>>       >
> >>>>       >
> >>>>       > y is the dependent variable, t1 and t2 are binary dummy variables,
> >>>>       > valued 0 or 1, indicating the time point. Model1 is estimated without
> >>>>       > any convergence problems and the reproduced (co)variances found with
> >>>>       >
> >>>>       >
> >>>>       >
> >>>>       > getVarCov(Model1, type = "marginal", individuals = "1")
> >>>>       >
> >>>>       >
> >>>>       >
> >>>>       > are identical to the observed (co)variances.
> >>>>       >
> >>>>       >
> >>>>       > My question is: how can lme estimate 4 (co)variances with only 3
> >>>>       > known (co)variances?
> >>>>       >
> >>>>       >
> >>>>       >
> >>>>       > The 4 estimates concern:
> >>>>       >
> >>>>       > -          std. deviation of the random effect of dummy t1
> >>>>       >
> >>>>       > -          std. deviation of the random effect of dummy t2
> >>>>       >
> >>>>       > -          covariance of the random effects of the dummies t1 and t2
> >>>>       >
> >>>>       > -          residual std. error
> >>>>       >
> >>>>       >
> >>>>       >
> >>>>       > Related to the question above: how can the variances of the random
> >>>>       > effects and the residual std. error be interpreted?
> >>>>       >
> >>>>       >
> >>>>       >
> >>>>       > Thanks for any help,
> >>>>       >
> >>>>       >
> >>>>       >
> >>>>       > Ben.
> >>>>       >
> >>>>      --     John Fox, Professor Emeritus
> >>>>      McMaster University
> >>>>      Hamilton, Ontario, Canada
> >>>>      web: https://socialsciences.mcmaster.ca/jfox/
> >>>>
> >>>
> >> --
> >> John Fox, Professor Emeritus
> >> McMaster University
> >> Hamilton, Ontario, Canada
> >> web: https://socialsciences.mcmaster.ca/jfox/
> >>
> >
> >
> --
> John Fox, Professor Emeritus
> McMaster University
> Hamilton, Ontario, Canada
> web: https://socialsciences.mcmaster.ca/jfox/
>
>



