[R-sig-ME] Fixed effects cannot be interpreted if Y is transformed in (g)lme's [was: conditional vs. marginal coefficients in GLMM]

Mon Mar 14 07:50:26 CET 2011

Dear Mixed-modelers,

I am aware that the point is not as simple as I have stated in the subject line but I hope to pique your curiosity and thus provoke some answers.

I came across this issue through a reviewer's request on one of our manuscripts. As you all know, it is sometimes worthwhile to argue with reviewers, while on other points one simply tries to accommodate their wishes. Nevertheless, I think one fundamental point, raised by the reviewer or rather by the comments I received in response to my original posting, is worth discussing (and it unsettles me quite a bit).

In our research group, we mostly use mixed-modeling techniques because we need to accommodate dependencies in the data from our experimental designs, which involve repeated measurements and hierarchical nesting. We usually have little interest in the actual (relative) size of the random effects.

Let's take a somewhat simplified version of our real problem: we have observed the occurrence of lameness in three consecutive seasons for 10 cows on each of 36 farms. The farms additionally differed in the type of flooring they provide in the barns. Thus, we are interested in the risk of lameness as a function of season (within effect) and type of flooring (between effect). To accommodate dependencies we have included a nested random effect of cows (repeated measurement) within farms (hierarchical nesting). We originally implemented this model in glmmPQL (family = quasibinomial) but may switch to glmer (family = binomial, including an additional observation-level random effect to check for over-dispersion) because we have been asked to conduct likelihood-ratio tests.
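To make the design concrete, here is a minimal sketch in Python (standard library only) that simulates data with this structure: 36 farms, 10 cows per farm, 3 seasons, with a farm-level and a cow-within-farm random intercept on the logit scale. All numeric values (intercept, season and floor effects, random-effect SDs), the two floor labels and the function name `simulate_lameness` are assumptions made up for illustration; they are not estimates from our data.

```python
import math
import random

# Hypothetical parameters on the logit scale (illustration only).
INTERCEPT = -1.5
SEASON_EFFECT = [0.0, 0.3, 0.6]        # three consecutive seasons
FLOOR_EFFECT = {"A": 0.0, "B": 0.8}    # two made-up floor types


def simulate_lameness(n_farms=36, cows_per_farm=10,
                      sd_farm=1.0, sd_cow=0.5, seed=1):
    """Simulate binary lameness records with cows nested in farms.

    Returns a list of tuples (farm, cow, season, floor, lame).
    """
    rng = random.Random(seed)
    rows = []
    for farm in range(n_farms):
        # Floor type is a between-farm factor: one type per farm.
        floor = "A" if farm % 2 == 0 else "B"
        b_farm = rng.gauss(0.0, sd_farm)      # farm random intercept
        for cow in range(cows_per_farm):
            b_cow = rng.gauss(0.0, sd_cow)    # cow-within-farm random intercept
            for season, season_eff in enumerate(SEASON_EFFECT):
                eta = (INTERCEPT + season_eff + FLOOR_EFFECT[floor]
                       + b_farm + b_cow)
                p = 1.0 / (1.0 + math.exp(-eta))   # inverse logit
                lame = 1 if rng.random() < p else 0
                rows.append((farm, cow, season, floor, lame))
    return rows


rows = simulate_lameness()
print(len(rows), "observations")
```

Both random intercepts have mean zero on the logit scale, which is the scale on which the fixed effects are defined.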

Now, of course, in such a generalized model the expected response is modeled on a transformed (logit) scale. But the point to be made is just as relevant for mixed models based on the normal distribution if the outcome variable needs to be transformed (e.g. log) to satisfy the statistical assumptions on the error and random-effects distributions.

The point raised is that the estimated parameters reflect the average reaction of the 'population' only on the transformed scale, because the (additive) random effects average to zero only on that scale (conditional estimates). If model estimates are back-transformed, or e.g. ORs are calculated as in our example, then these values are biased as estimates of population averages because the random effects no longer average out.
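A small numerical sketch of this point, in Python with the standard library only: for a single random intercept b ~ N(0, sigma^2) on the logit scale, it compares the back-transformed fixed-effect prediction with the probability averaged over the random-effect distribution. The values of `eta` and `sigma` are hypothetical, and the simple midpoint-rule integration is just one convenient way to compute the average, not what any particular package does.

```python
import math


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def marginal_probability(eta, sigma, n_steps=4000, z_max=8.0):
    """Average expit(eta + b) over b ~ N(0, sigma^2) using a midpoint rule."""
    h = 2.0 * z_max / n_steps
    total = 0.0
    for i in range(n_steps):
        z = -z_max + (i + 0.5) * h            # standard-normal grid point
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += expit(eta + sigma * z) * density * h
    return total


eta = -2.0     # hypothetical fixed-effect prediction on the logit scale
sigma = 1.5    # hypothetical random-intercept SD on the logit scale

conditional = expit(eta)                       # probability for a unit with b = 0
marginal = marginal_probability(eta, sigma)    # average over all units

print(f"back-transformed fixed effect:  {conditional:.3f}")
print(f"population-average probability: {marginal:.3f}")
```

With these assumed values the population-average probability comes out larger than the back-transformed fixed effect, i.e. it is pulled towards 0.5. The two coincide only when sigma is zero or when eta is zero.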

If my understanding is correct, the back-transformed model estimates still reflect the relative risk at the level of the observational units. In our case, we can still calculate ORs that reflect the risk of our cows suffering from lameness when going from one season to the next. We can also calculate ORs for our floor types, but these are in a sense virtual: they represent the relative risk as if one were to switch from one floor type to another on any given farm (as if it were a within effect).

These ORs do not, however, provide a direct estimate of the "population-wide" occurrence of lameness (marginal estimates), i.e. the mean proportion of lame cows that you would find on a set of farms with a given floor type. For this we would need other methods, and GEE has been suggested.
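The difference between the two kinds of OR can be sketched numerically as well (Python, standard library only). The code compares the conditional OR exp(beta) for a floor-type effect with the OR computed from the population-averaged probabilities under each floor type, which is roughly the quantity a marginal model would target. The intercept, the floor effect `beta_floor`, the random-intercept SD `sigma` and the function names are all assumptions for illustration, and a single random intercept stands in for our nested structure.

```python
import math


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def marginal_probability(eta, sigma, n_steps=4000, z_max=8.0):
    """Average expit(eta + b) over b ~ N(0, sigma^2) using a midpoint rule."""
    h = 2.0 * z_max / n_steps
    total = 0.0
    for i in range(n_steps):
        z = -z_max + (i + 0.5) * h
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += expit(eta + sigma * z) * density * h
    return total


def odds(p):
    return p / (1.0 - p)


def marginal_odds_ratio(intercept, beta, sigma):
    """OR between population-averaged probabilities with and without beta."""
    p_ref = marginal_probability(intercept, sigma)
    p_alt = marginal_probability(intercept + beta, sigma)
    return odds(p_alt) / odds(p_ref)


intercept = -2.0    # hypothetical logit for the reference floor type
beta_floor = 1.0    # hypothetical conditional log-OR for the other floor type
sigma = 1.5         # hypothetical random-intercept SD

conditional_or = math.exp(beta_floor)
marginal_or = marginal_odds_ratio(intercept, beta_floor, sigma)

print(f"conditional OR (same farm, floor switched): {conditional_or:.2f}")
print(f"marginal OR (averaged over farms):          {marginal_or:.2f}")
```

Under these assumptions the marginal OR lies between 1 and the conditional OR, and the gap widens as sigma grows.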

How do you weigh these two types of models with respect to the interpretation of back-transformed, model-derived measures, i.e. their fixed-effects estimates? Is this an issue that guides your choice of statistical model in practice? How do you deal with interpreting fixed effects in mixed models if the outcome variable has been transformed? Or do you not interpret these effects at all?

Many thanks for your thoughts, Lorenz
-
Lorenz Gygax
Federal Veterinary Office FVO
Centre for proper housing of ruminants and pigs
Tänikon, CH-8356 Ettenhausen / Switzerland