[R-sig-ME] results lme unstructured covariance matrix, again

John Fox jfox using mcmaster.ca
Tue Jul 19 19:23:04 CEST 2022


Dear Ben,

The discussion is starting to stray from software issues, so I'll try to 
answer briefly.

I think that you're emphasizing the wrong point, which is how to specify 
a model that produces estimates, rather than specifying a random-effects 
structure that's reasonable for the data.

As you point out, an unrestricted variance-covariance structure for the 
model you originally specified is underidentified, as is a similar model 
with 3 within-subjects measurements. As you also correctly point out, 
fitting 3 occasions with 2 dummy regressors is equivalent to fitting a 
quadratic -- both have 3 parameters (including the intercept).

Placing a restriction on the random effects, such as setting the 
observation-level error variance to 0, can identify the 
variance-covariance parameters, as can other sorts of restrictions. If 
just one constraint is placed on the variances and covariances of the 
random effects, the resulting variance-covariance parameters will be 
just-identified, and different such constraints are equivalent, in the 
sense of producing the same likelihood and fixed-effects estimates. You 
can see how this works by looking at the equations I wrote out for the 
variance-covariance components.
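
For example, a minimal sketch of one such constraint, applied to the data and 
model from the example below: fixing the residual standard deviation at a 
small positive value (standing in for 0) via lmeControl(). The object name 
m2fix is just illustrative:

  library(nlme)
  m2fix <- lme(y ~ 1 + t2, random = ~ 1 + t2 | person, data=da,
               control=lmeControl(sigma=1e-6))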

This is simply a mechanical point, and in general there's no reason to 
prefer one constraint to another, but in a real application one model 
may be more reasonable than another for the data at hand. For example, 
the data I generated have only one true random effect -- the 
observation-level error -- and so a correct model for the data has only one 
variance-covariance component -- the error variance. lm(y ~ t2, data=da) 
fits this model, which is much more restrictive than a single constraint 
on the more general mixed model. More elaborate random-effect 
specifications for my data have superfluous parameters, whether or not 
they're identified.
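
To illustrate with the generated data, a quick sketch of that lm() fit (m0 is 
just an illustrative name; the reproducible example is below):

  m0 <- lm(y ~ t2, data=da)
  summary(m0)$sigma^2  # the single variance-covariance component: the error variance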

Best,
  John

On 2022-07-19 8:50 a.m., ben pelzer wrote:
> Dear John, Wolfgang, James and Ben,
> 
> Thanks to you all for your explanations and examples! As I understand it
> now, cancelling the estimation of sigma, by fixing it to a small positive
> number close to zero, produces random-effects estimates of the intercept
> and t2 (or of t1 and t2) which make sense. However, the loglikelihood
> doesn't make sense. I guess that with sigma fixed to nearly zero, the
> predictions will be meaningful too, even when including the random effects
> in the predictions, though I have not verified that yet.
> 
> All in all, I think that for a model with an unstructured covariance
> matrix, "gls" is more straightforward. For the example of John:
> 
> da$time <- da$t2 + 1
> m4 <- gls(y ~ 1 + t2,
>            correlation=corSymm(form= ~time|person),
>            weights=varIdent(form= ~1|time),
>            data=da)
> 
> and the results are equal to e.g.
> 
> m2 <- lme(y ~ 1 + t2, random = ~ 1 + t2 | person, data=da)
> 
> as far as the loglikelihood and the estimated (co)variances (getVarCov) are
> concerned.
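> 
> As a quick check (a sketch, using the two fits above):
> 
> logLik(m4)
> logLik(m2)
> getVarCov(m4)                    # marginal covariance matrix for one person
> getVarCov(m2, type="marginal")   # implied marginal covariance from lme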
> 
> My question about "lme" arose when helping someone with such an
> unstructured covariance model, for data with three fixed occasions. With
> "many" occasions, a random linear effect of time is often employed, or
> maybe a random quadratic or cubic time effect... Anyway, time is used as a
> random (linear etc.) effect in such models. But with only three occasions,
> and with the time effect not being linear but quadratic, the model turns
> into the kind of model I was asking about, but with three occasions
> instead of two. So, instead of
> 
> modelA <- lme(y ~ 1+time+timesquared, random= ~ 1+time+timesquared |
> person, data=da)
> 
> one could then use
> 
> modelB <- lme(y ~ 1+t2+t3, random= ~ 1+t2+t3 | person, data=da)
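> 
> where t2 and t3 are occasion dummies; a sketch of how they might be
> constructed, assuming time takes the values 1, 2, 3:
> 
> da$t2 <- as.numeric(da$time == 2)
> da$t3 <- as.numeric(da$time == 3)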
> 
> But the random-effect variances of t2 and t3 are not meaningful, as I know
> now. And with the often-used "lmer" the model is not estimated (except when
> avoiding the "counting check" as Ben Bolker explained; see the sketch below).
> This means that the "random time effect" models, which I explain to students
> in a course, are not applicable to such data. I thought/hoped that "lme"
> could be used, and in a way it can, by fixing sigma, but apparently this
> also leads to some "unexpected" results. On the other hand, "gls" is a new
> method for students used to the "random time effect" models. So I have to
> choose between these two possibilities.
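> 
> For reference, a sketch of avoiding lmer's counting check (I'm assuming
> check.nobs.vs.nRE is the relevant lmerControl option, and modelB2 is just
> an illustrative name; the fit will likely be singular, since the residual
> variance is redundant):
> 
> library(lme4)
> modelB2 <- lmer(y ~ 1 + t2 + t3 + (1 + t2 + t3 | person), data=da,
>                 control=lmerControl(check.nobs.vs.nRE="ignore"))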
> 
> Thanks again everyone, time to switch off the computer and enjoy the sun.
> It's almost 40 degrees Celsius or 104 Fahrenheit in The Netherlands now, so
> maybe enjoying is not the right word. Best to you all,
> 
> Ben Pelzer.
> 
> 
> 
> 
> 
> On Tue, 19 Jul 2022 at 04:15, John Fox <jfox using mcmaster.ca> wrote:
> 
>> Dear Ben,
>>
>> On 2022-07-18 1:32 p.m., John Fox wrote:
>>> Dear Ben,
>>>
>>> My apologies---I paid attention to the inessential part of your message,
>>> which was the parametrization of the model rather than the
>>> variance-covariance structure.
>>>
>>> This is a little awkward because of the LaTeX (and as far as I know, I
>>> can't attach a PDF, but maybe you can paste this into a LaTeX editor):
>>
>> Rolf Turner pointed out to me that PDF attachments *are* acceptable, and
>> so I've attached a PDF with the LaTeX part of my message.
>>
>> Best,
>> John
>>
>>>
>>> The model is $Y_{ij} = \alpha + \beta x_{ij} + \delta_{1i} + \delta_{2i}
>>> x_{ij} + \varepsilon_{ij}$ for $i = 1, \ldots, n$ and $j = 1,2$, where
>>> $x_{i1} = 1$ and $x_{i2} = 0$.
>>>
>>> The variances of the random effects are $V(\delta_{1i}) = \psi_1^2$ and
>>> $V(\delta_{2i}) = \psi_2^2$, and their covariance is $C(\delta_{1i},
>>> \delta_{2i}) = \psi_{12}$. The observation-level error variance is
>>> $V(\varepsilon_{ij}) = \sigma^2$.
>>>
>>> Then the composite error is $\zeta_{ij} = \delta_{1i} + \delta_{2i}
>>> x_{ij} + \varepsilon_{ij}$ with variance $V(\zeta_{ij}) = \psi_1^2 +
>>> x^2_{ij} \psi_2^2 + 2 x_{ij} \psi_{12} + \sigma^2$, which is
>>> $V(\zeta_{i1}) = \psi_1^2 + \psi_2^2 + 2\psi_{12} + \sigma^2$ for $j =
>>> 1$ and $V(\zeta_{i2}) = \psi_1^2 + \sigma^2$ for $j = 2$, and
>>> covariance $C(\zeta_{i1}, \zeta_{i2}) = \psi_1^2 + \psi_{12}$.
>>>
>>> There are, as you say, 4 variance-covariance components, $\psi_1^2,
>>> \psi_2^2, \psi_{12}, \sigma^2$, and just 3 variances and covariances
>>> among the composite disturbances, $V(\zeta_{i1}), V(\zeta_{i2}),
>>> C(\zeta_{i1}, \zeta_{i2})$, and so---as you and several others have
>>> noted (but I missed)---the variance-covariance components are
>>> underidentified.
>>>
>>> lmer(), but not lme(), appears to use a heuristic, counting the number
>>> of random effects and observations, to detect underidentification. I
>>> don't know whether that's bullet-proof, but it works in this case. Maybe
>>> Ben Bolker, who has already responded, can comment further.
>>>
>>> Best,
>>>    John
>>>
>>> John Fox, Professor Emeritus
>>> McMaster University
>>> Hamilton, Ontario, Canada
>>> web: https://socialsciences.mcmaster.ca/jfox/
>>>
>>> On 2022-07-18 6:39 a.m., ben pelzer wrote:
>>>> Dear John,
>>>>
>>>> Thank you for answering my question and your nice example with the
>>>> alternative model formulations! But I'm still in doubt about the
>>>> results. The fact that there are only three observed (co)variances
>>>> which are being reproduced by four parameter estimates of lme leaves
>>>> me confused. Actually, I would think that there is no unique solution
>>>> without some constraint being applied. But I could not find anything
>>>> about such a constraint in the lme documentation. The random effects of t1
>>>> and t2, or of the intercept and t2 in your alternative model, should
>>>> be sufficient to reproduce the observed (co)variances, so the residual
>>>> variance is "unnecessary". I guess that is the reason that lmer is not
>>>> able to estimate the model.
>>>>
>>>> library(lme4)
>>>> m3 <- lmer(y ~ 1+t2+(1+t2|person), data=da)
>>>>
>>>> Error: number of observations (=20) <= number of random effects (=20)
>>>> for term (1 + t2 | person); the random-effects parameters and the
>>>> residual variance (or scale parameter) are probably unidentifiable
>>>>
>>>>
>>>> It's interesting to notice that the two model specifications (t1+t2
>>>> and 1+t2) lead to different residual variances estimated by lme. What
>>>> do these residual variances mean? And of course also: what do the
>>>> random effect variances estimated by both model formulations actually
>>>> mean or stand for? Do you have any ideas? Kind regards,
>>>>
>>>> Ben.
>>>>
>>>> On Sat, 16 Jul 2022 at 16:50, John Fox <jfox using mcmaster.ca
>>>> <mailto:jfox using mcmaster.ca>> wrote:
>>>>
>>>>      Dear Ben,
>>>>
>>>>      First, I'll make this into a reproducible example:
>>>>
>>>>        > set.seed(123)
>>>>        > t1 <- c(rep(1, 10), rep(0, 10))
>>>>        > t2 <- 1 - t1
>>>>        > person <- rep(1:10, 2)
>>>>        > y <- t2 + rnorm(20)
>>>>        > da <- data.frame(y, t1, t2, person)
>>>>
>>>>        > library(nlme)
>>>>
>>>>      Then note that the random-effect specification 0 + t1 + t2 is
>>>> simply a
>>>>      reparametrization of 1 + t2 (i.e., 1 = t1 + t2), which produces the
>>>>      same
>>>>      fit to the data (same fixed effects, same restricted
>> log-likelihood):
>>>>
>>>>        > m1 <- lme(y ~ 1 + t2, random = ~ 0 + t1 + t2 | person, data=da)
>>>>        > m2 <- lme(y ~ 1 + t2, random = ~ 1 + t2 | person, data=da)
>>>>        > m1
>>>>      Linear mixed-effects model fit by REML
>>>>          Data: da
>>>>          Log-restricted-likelihood: -25.92726
>>>>          Fixed: y ~ 1 + t2
>>>>      (Intercept)          t2
>>>>         0.07462564  1.13399632
>>>>
>>>>      Random effects:
>>>>         Formula: ~0 + t1 + t2 | person
>>>>         Structure: General positive-definite, Log-Cholesky
>> parametrization
>>>>                 StdDev    Corr
>>>>      t1       0.8964136 t1
>>>>      t2       0.9856215 0.647
>>>>      Residual 0.3258015
>>>>
>>>>      Number of Observations: 20
>>>>      Number of Groups: 10
>>>>
>>>>        > m2
>>>>      Linear mixed-effects model fit by REML
>>>>          Data: da
>>>>          Log-restricted-likelihood: -25.92726
>>>>          Fixed: y ~ 1 + t2
>>>>      (Intercept)          t2
>>>>         0.07462564  1.13399632
>>>>
>>>>      Random effects:
>>>>         Formula: ~1 + t2 | person
>>>>         Structure: General positive-definite, Log-Cholesky
>> parametrization
>>>>                    StdDev    Corr
>>>>      (Intercept) 0.8787887 (Intr)
>>>>      t2          0.7540826 -0.302
>>>>      Residual    0.3707215
>>>>
>>>>      Number of Observations: 20
>>>>      Number of Groups: 10
>>>>
>>>>      Finally, it's unnecessary to supply the intercept 1 in the model
>>>>      formula
>>>>      since the intercept is implied if it's not explicitly excluded:
>>>>
>>>>        > m3 <- lme(y ~ t2, random = ~ t2 | person, data=da)
>>>>        > m3
>>>>      Linear mixed-effects model fit by REML
>>>>          Data: da
>>>>          Log-restricted-likelihood: -25.92726
>>>>          Fixed: y ~ t2
>>>>      (Intercept)          t2
>>>>         0.07462564  1.13399632
>>>>
>>>>      Random effects:
>>>>         Formula: ~t2 | person
>>>>         Structure: General positive-definite, Log-Cholesky
>> parametrization
>>>>                    StdDev    Corr
>>>>      (Intercept) 0.8787887 (Intr)
>>>>      t2          0.7540826 -0.302
>>>>      Residual    0.3707215
>>>>
>>>>      Number of Observations: 20
>>>>      Number of Groups: 10
>>>>
>>>>      I hope this helps,
>>>>         John
>>>>
>>>>      On 2022-07-16 10:05 a.m., ben pelzer wrote:
>>>>       > Sorry, my previous mail contained another question which is
>>>>      irrelevant...
>>>>       > I deleted that now.
>>>>       >
>>>>       >
>>>>       > Hi all,
>>>>       >
>>>>       > I have a question about results from lme of package nlme.
>>>>       >
>>>>       > Suppose the data consists of repeated measures at two fixed time
>>>>      points.
>>>>       >
>>>>       > I used the following equation:
>>>>       >
>>>>       >
>>>>       >
>>>>       > Model1 <- lme ( y ~ 1+t2 , random = ~ 0 + t1+t2|person, data=da)
>>>>       >
>>>>       >
>>>>       >
>>>>       > y is the dependent, t1 and t2 are binary dummy variables, valued
>>>>      0 or 1,
>>>>       > indicating the time point.  Model1 is estimated without any
>>>>      convergence
>>>>       > problems and the reproduced (co)variances found with
>>>>       >
>>>>       >
>>>>       >
>>>>       > getVarCov(Model1, type="marginal", individual="1")
>>>>       >
>>>>       >
>>>>       >
>>>>       > are identical to the observed (co)variances.
>>>>       >
>>>>       >
>>>>       > My question is:  how can lme estimate 4 (co)variances with only 3
>>>>      known
>>>>       > (co)variances?
>>>>       >
>>>>       >
>>>>       >
>>>>       > The 4 estimates concern:
>>>>       >
>>>>       > -          std. deviation of the random effect of dummy t1
>>>>       >
>>>>       > -          std. deviation of the random effect of dummy t2
>>>>       >
>>>>       > -          covariance of the random effects of the dummies t1 and
>>>>      t2
>>>>       >
>>>>       > -          residual std. error
>>>>       >
>>>>       >
>>>>       >
>>>>       > Related to the question above: how can the variances of the
>>>>      random effects
>>>>       > and the residual std. error be interpreted?
>>>>       >
>>>>       >
>>>>       >
>>>>       > Thanks for any help,
>>>>       >
>>>>       >
>>>>       >
>>>>       > Ben.
>>>>       >
>>>>       >
>>>>       > _______________________________________________
>>>>       > R-sig-mixed-models using r-project.org
>>>>      <mailto:R-sig-mixed-models using r-project.org> mailing list
>>>>       > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>>      <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
>>>>      --
>>>>      John Fox, Professor Emeritus
>>>>      McMaster University
>>>>      Hamilton, Ontario, Canada
>>>>      web: https://socialsciences.mcmaster.ca/jfox/
>>>>      <https://socialsciences.mcmaster.ca/jfox/>
>>>>
>>>
>>> _______________________________________________
>>> R-sig-mixed-models using r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>> --
>> John Fox, Professor Emeritus
>> McMaster University
>> Hamilton, Ontario, Canada
>> web: https://socialsciences.mcmaster.ca/jfox/
>>
> 
> 
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
-- 
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/


