[R-sig-ME] gls error terms and (co)variances

Thu Mar 10 13:15:27 CET 2022

Dear list,

I have two questions about model-predicted (co)variances by gls from the
nlme package.

Question 1)

For longitudinal data with 4 fixed occasions in time for each respondent
and no missing values, I estimated a gls model with the fixed occasions as
predictors and an unstructured (co)variance matrix:

library(nlme)

UNSTRUC1 <- gls(read ~ 1 + occf, data = da, method = "REML",
                correlation = corSymm(form = ~ occasion | id),
                weights = varIdent(form = ~ 1 | occasion))

"occasion" runs from 1 to 4, "occf" is the factor version of occasion.

The model predicted (co)variances/correlations for the four occasions are
almost, but not exactly, equal to the (co)variances/correlations of the
residuals found with:

resid <- residuals(UNSTRUC1)

I'm wondering why these (small) discrepancies exist. When extra predictor
variables are added to the above model, the discrepancies between the
(co)variances extracted from the gls model object and those calculated
from the residuals in "resid" get larger. The var() function computes the
variances of the residuals in "resid" with df = n-1. Even after correcting
this to n-4 for the above model, or to n-6 when two extra predictors are
added, the discrepancies remain. I must be overlooking something, but
what? Btw, I used a small routine, corandcov(), found at
https://rdrr.io/github/emilelatour/lamisc/src/R/corandcov.R, to extract
the model-predicted (co)variances and correlations for gls models.
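
For reference, a minimal sketch of the comparison described in Question 1.
It runs on simulated data, since "da" is not included here: the sample
size, the covariance matrix Sigma and the occasion means are assumptions
made up for illustration. It uses getVarCov() from nlme in place of
corandcov(). In a balanced, complete design with only the occasion factor
in the mean model, the REML estimate of an unstructured covariance matrix
should coincide with the per-occasion sample covariance of the residuals
(divisor n-1), so any remaining difference would come from the optimizer's
convergence tolerance.

```r
library(nlme)

set.seed(1)
n <- 60                                        # simulated respondents (assumption)
Sigma <- 4 * 0.5^abs(outer(1:4, 1:4, "-"))     # hypothetical true covariance
y <- matrix(rnorm(n * 4), n, 4) %*% chol(Sigma)
y <- sweep(y, 2, c(10, 11, 12, 13), "+")       # hypothetical occasion means

# Long format: one row per respondent and occasion, ordered by id
da <- data.frame(id = rep(1:n, each = 4),
                 occasion = rep(1:4, times = n),
                 read = as.vector(t(y)))
da$occf <- factor(da$occasion)

fit <- gls(read ~ 1 + occf, data = da, method = "REML",
           correlation = corSymm(form = ~ occasion | id),
           weights = varIdent(form = ~ 1 | occasion))

# Model-implied 4 x 4 covariance matrix for the first respondent
model_cov <- unclass(getVarCov(fit, individual = 1))

# Residuals reshaped to one row per respondent, one column per occasion
res_wide <- matrix(residuals(fit), ncol = 4, byrow = TRUE)
resid_cov <- cov(res_wide)                     # divisor n - 1

round(model_cov - resid_cov, 4)
```

Note that the residual covariances are computed per occasion across
respondents (wide format), not on the pooled residual vector.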

Question 2)

For the above model, the predicted correlations for the four occasions are
(almost) equal to the observed correlations, as expected. When comparing
different correlation structures, such as compound symmetry, AR(1),
Toeplitz, etc., one can compare the observed correlations with those
predicted by each of these models, to decide "at face value" which model's
predicted correlations come closest to the observed correlations. Of
course, the unstructured pattern is always "closest", but how well do the
other patterns fit (or not)? In addition to other fit measures like AIC,
BIC and deviance, comparing the predicted correlations with the observed
ones could help to assess the model's quality. However, if there are extra
predictors involved, gls still produces predicted correlations across the
four occasions, but there is no simple "observed correlation matrix" to
compare the predicted correlation matrix with. I tried a linear regression
model with occf and the extra predictors, and calculated the variances and
correlations of the residuals, but these are in general rather different
from the ones gls produces. Do you have any idea how to calculate such an
"observed" correlation matrix when there are more predictors than just the
occasion factor occf?
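
A minimal sketch of the "face value" comparison for the case with only the
occasion factor, again on simulated data (the sample size and Sigma are
assumptions, and implied_cor() is a hypothetical helper written for this
example). It fits the unstructured, compound-symmetry and AR(1) structures
and reports how far each model-implied correlation matrix lies from the
observed one. It does not address the case with extra predictors.

```r
library(nlme)

set.seed(2)
n <- 80                                        # simulated respondents (assumption)
Sigma <- 4 * 0.6^abs(outer(1:4, 1:4, "-"))     # hypothetical AR(1)-like truth
y <- matrix(rnorm(n * 4), n, 4) %*% chol(Sigma)
da <- data.frame(id = rep(1:n, each = 4),
                 occasion = rep(1:4, times = n),
                 read = as.vector(t(y)))
da$occf <- factor(da$occasion)

# Unstructured model, then the same model with other correlation structures
base <- gls(read ~ 1 + occf, data = da, method = "REML",
            correlation = corSymm(form = ~ occasion | id),
            weights = varIdent(form = ~ 1 | occasion))
cs  <- update(base, correlation = corCompSymm(form = ~ occasion | id))
ar1 <- update(base, correlation = corAR1(form = ~ occasion | id))

# Hypothetical helper: model-implied correlation matrix of a gls fit
implied_cor <- function(fit) cov2cor(unclass(getVarCov(fit, individual = 1)))

# Observed correlations between the four occasions (wide format)
observed <- cor(matrix(da$read, ncol = 4, byrow = TRUE))

# Largest absolute gap between implied and observed correlations per model
gaps <- sapply(list(unstructured = base, cs = cs, ar1 = ar1),
               function(f) max(abs(implied_cor(f) - observed)))
gaps
AIC(base, cs, ar1)
```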

Thanks for any help, Ben.
