[R-sig-ME] Yet another question about model specification and interpretation
Douglas Bates
bates at stat.wisc.edu
Sun Mar 16 14:09:32 CET 2008
On Sun, Mar 16, 2008 at 7:40 AM, Martin Eklund
<martin.eklund at farmbio.uu.se> wrote:
> [snip]
> > Take a look at the deviances of the three models. To do a safer
> > comparison it would be better to refit all three models with method =
> > "ML" before comparing values like AIC and BIC but you can notice that
> > the deviance (both ML and REML) and the AIC and BIC of models (ii) and
> > (iii) are essentially identical, which is as it should be because the
> > only differences are in the parameterization of the random effects.
> > However, model (i), which has an additional parameter compared to
> > models (ii) and (iii), gives higher AIC, BIC and deviances. This is a
> > sign that something is going wrong in that model.
>
> Yes, I do understand that something goes wrong in the model lmer
> (score ~ Machine + (1 | Worker) + (0 + Machine | Worker)). Indeed the
> fit is better for the models lmer(score ~ Machine + (0 + Machine |
> Worker)) and lmer(score ~ Machine + (Machine | Worker)), which
> naturally are just reparametrizations of one another and therefore
> produce similar deviances, AICs, and BICs. However, if I do want an
> estimate of the Machine, the Worker and all interaction effects, how
> would I in a correct way specify such a model? I'm not really after
> the best model here, I'm just trying to understand and learn how to
> do things in lme4.
>
> Do I understand the model lmer(score ~ Machine + (0 + Machine |
> Worker)) correctly when I say that we don't get an estimate of the
> random effect of Worker, only estimates of the random interaction
> effect between Worker and Machine? If so, how can I get an estimate
> of the random Worker effect at the same time as estimating all the
> random interaction effects? If not, how do I from the output from lmer
> (score ~ Machine + (0 + Machine | Worker)) (see below) determine the
> random Worker effect?
I'm not sure if we have discussed it in this thread but there is
another model specification that may make more sense to you
lmer(score ~ Machine + (1|Worker) + (1|Worker:Machine), Machines)
or, equivalently,
lmer(score ~ Machine + (1|Worker/Machine), Machines)
This model allows for an additive random effect per worker and for an
additive random effect for the Worker:Machine interaction. It differs
from the models described earlier in the form of the
variance-covariance matrix for the interaction terms. In this model
the interaction terms have constant variance and are independent. In
the previous models the effects for each Worker-Machine combination
are allowed to be correlated within worker.
> ====================================
>
> Linear mixed-effects model fit by REML
> Formula: score ~ Machine + (0 + Machine | Worker)
> Data: Machines
> AIC BIC logLik MLdeviance REMLdeviance
> 226.3 244.2 -104.2 216.6 208.3
> Random effects:
> Groups Name Variance Std.Dev. Corr
> Worker MachineA 16.64234 4.07950
> MachineB 74.37306 8.62398 0.803
> MachineC 19.26441 4.38912 0.623 0.771
> Residual 0.92464 0.96158
> number of obs: 54, groups: Worker, 6
>
> Fixed effects:
> Estimate Std. Error t value
> (Intercept) 52.356 1.681 31.149
> MachineB 7.967 2.420 3.292
> MachineC 13.917 1.540 9.037
>
> Correlation of Fixed Effects:
> (Intr) MachnB
> MachineB 0.463
> MachineC -0.374 0.301
> ====================================
>
>
> Thank you!
>
> Best regards,
>
> Martin.
>
> P.S. I'm sorry to keep bugging you guys, but I need to understand
> this in order to make some sense of some experimental data. I really
> like the idea of R in general and lme4 in particular and in order to
> avoid using e.g. SPSS or some other commercial package I fully have
> to understand the model specifications and interpretations. For me,
> who are used to regression and the lm command family, this is a bit
> unfamiliar and tricky.
>
>
>
>
> > As I said in my private reply, fitting that model should fail but
> > apparently it doesn't. The software converges to a spurious result
> > and it takes a bit of digging to discover that the result is indeed
> > spurious. It would be better if the software weree to detect such
> > problems and indeed it is possible but difficult to do this. It is
> > also the case that all of the programming effort to do so and time
> > spent checking models for problems like this diverts from other
> > enhancements that could be made to the software.
> >
> > As an editorial comment, those who would like to have the ability to
> > impose correlation structures or variance functions, in addition to
> > the correlations generated by the random effects, on the conditional
> > distribution of the response given the random effects (i.e. make lmer
> > more like lme in the range of models that can be fit) may want to bear
> > this case of getting a spurious fit to a model that is not well
> > defined in mind. Surprisingly the majority of the effort in allowing
> > specifications like that is not in designing and implementing the data
> > structures and numerical algorithms. That part is comparatively easy.
> > The difficult part is deciding how the user should specify the model
> > and how the specification is translated into the structures and
> > algorithms. You may think, "But that's easy - just use the lme
> > specification" but that won't work. The nlme package was created for
> > S and S-PLUS then ported to R and it does not follow the set of steps
> > for model-fitting functions that is native to R. Most of the time
> > that is not a problem but every once in a while it breaks and it
> > breaks in ugly ways. Think of the recent discussion on this list
> > about getting fitted values and residuals that appropriately take
> > account of the pattern of missingness in the original data. How does
> > the information on the pattern of missingness get into the model frame
> > in the first place? How are the "subset", "na.action", "offset", etc.
> > arguments handled by model-fitting functions in R?
> >
> > These are subtle and difficult issues. Furthermore, the definition of
> > the models themselves must be done with care. It is not possible to
> > independently specify correlations, variance structures, conditional
> > distributions (in the sense of models for binomial or Poisson
> > responses) and arbitrary random effects structures. Certain
> > combinations don't make sense. It is actually stronger than that,
> > certain combinations are impossible. Witness the recent suggestion
> > that the lme4 package is somehow inferior because it doesn't allow
> > specification of "the" repeated measures model for a binary response.
> > Apparently "the" repeated measures model specifies a certain
> > conditional variance-covariance structure in the responses and the
> > fact that this structure is incompatible with the variances implied by
> > the binomial response should not prevent specification and fitting of
> > such a model. Mathematical impossibility is apparently a rather
> > flimsy excuse for not providing software that fits such a model.
> >
> > Later I'll write more on the issue of variance structures and
> > correlation structures. I plan to provide classes for extensions to
> > the mixed models (in fact if you look at the AllClasses.R file on the
> > archive you will see they are already there) and show how they could
> > be used to fit certain simple extensions, like a model with stratified
> > conditional variances. Then a person who wants to create software to
> > do a "one off" fit of a mixed model including various "mix-in"
> > specifications can do so. However, I don't plan on offering general
> > model specifications in lmer until I, or someone else, can work out
> > end-to-end how to do it without painting yourself into a corner where
> > impossible model specifications can be fit. And also I want it to be
> > done in an Rish way so that all the other model arguments work as they
> > should. (The main stumbling block there is that a call to model.frame
> > allows for only 1 formula and you must be very careful how you create
> > that formula. If you create an omnibus formula from a collection of
> > other formulas you have to decide what the environment of the omnibus
> > formula should be.)
>
>
>
> ========================================
> Martin Eklund
> PhD Student
> Department of Pharmaceutical Biosciences
> Uppsala University, Sweden
> Ph: +46-18-4714281
> ========================================
>
>
>
>
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
More information about the R-sig-mixed-models
mailing list