[R-sig-ME] Yet another question about model specification and interpretation

Sun Mar 16 14:09:32 CET 2008

On Sun, Mar 16, 2008 at 7:40 AM, Martin Eklund
<martin.eklund at farmbio.uu.se> wrote:

>  [snip]

>  > Take a look at the deviances of the three models.  To do a safer
>  > comparison it would be better to refit all three models with method =
>  > "ML" before comparing values like AIC and BIC but you can notice that
>  > the deviance (both ML and REML) and the AIC and BIC of models (ii) and
>  > (iii) are essentially identical, which is as it should be because the
>  > only differences are in the parameterization of the random effects.
>  > However, model (i), which has an additional parameter compared to
>  > models (ii) and (iii), gives higher AIC, BIC and deviances.  This is a
>  > sign that something is going wrong in that model.
>
>  Yes, I do understand that something goes wrong in the model lmer
>  (score ~ Machine + (1 | Worker) + (0 + Machine | Worker)). Indeed the
>  fit is better for the models lmer(score ~ Machine + (0 + Machine |
>  Worker)) and lmer(score ~ Machine + (Machine | Worker)), which
>  naturally are just reparametrizations of one another and therefore
>  produce similar deviances, AICs, and BICs. However, if I do want an
>  estimate of the Machine, the Worker and all interaction effects, how
>  would I in a correct way specify such a model? I'm not really after
>  the best model here, I'm just trying to understand and learn how to
>  do things in lme4.
>
>  Do I understand the model lmer(score ~ Machine + (0 + Machine |
>  Worker)) correctly when I say that we don't get an estimate of the
>  random effect of Worker, only estimates of the random interaction
>  effect between Worker and Machine? If so, how can I get an estimate
>  of the random Worker effect at the same time as estimating all the
>  random interaction effects? If not, how do I from the output from lmer
>  (score ~ Machine + (0 + Machine | Worker)) (see below) determine the
>  random Worker effect?

I'm not sure if we have discussed it in this thread but there is
another model specification that may make more sense to you

lmer(score ~ Machine + (1|Worker) + (1|Worker:Machine), Machines)

or, equivalently,

lmer(score ~ Machine + (1|Worker/Machine), Machines)

This model allows for an additive random effect per worker and for an
additive random effect for the Worker:Machine interaction.  It differs
from the models described earlier in the form of the
variance-covariance matrix for the interaction terms.  In this model
the interaction terms have constant variance and are independent.  In
the previous models the effects for each Worker-Machine combination
are allowed to be correlated within worker.

>  ====================================
>
> Linear mixed-effects model fit by REML
>  Formula: score ~ Machine + (0 + Machine | Worker)
>     Data: Machines
>     AIC   BIC logLik MLdeviance REMLdeviance
>   226.3 244.2 -104.2      216.6        208.3
>  Random effects:
>   Groups   Name     Variance Std.Dev. Corr
>   Worker   MachineA 16.64234 4.07950
>            MachineB 74.37306 8.62398  0.803
>            MachineC 19.26441 4.38912  0.623 0.771
>   Residual           0.92464 0.96158
>  number of obs: 54, groups: Worker, 6
>
>  Fixed effects:
>              Estimate Std. Error t value
>  (Intercept)   52.356      1.681  31.149
>  MachineB       7.967      2.420   3.292
>  MachineC      13.917      1.540   9.037
>
>  Correlation of Fixed Effects:
>           (Intr) MachnB
>  MachineB  0.463
>  MachineC -0.374  0.301
>  ====================================
>
>
>  Thank you!
>
>  Best regards,
>
>  Martin.
>
>  P.S. I'm sorry to keep bugging you guys, but I need to understand
>  this in order to make some sense of some experimental data. I really
>  like the idea of R in general and lme4 in particular and in order to
>  avoid using e.g. SPSS or some other commercial package I fully have
>  to understand the model specifications and interpretations. For me,
>  who are used to regression and the lm command family, this is a bit
>  unfamiliar and tricky.
>
>
>
>
>  > As I said in my private reply, fitting that model should fail but
>  > apparently it doesn't.  The software converges to a spurious result
>  > and it takes a bit of digging to discover that the result is indeed
>  > spurious.  It would be better if the software weree to detect such
>  > problems and indeed it is possible but difficult to do this. It is
>  > also the case that all of the programming effort to do so and time
>  > spent checking models for problems like this diverts from other
>  > enhancements that could be made to the software.
>  >
>  > As an editorial comment, those who would like to have the ability to
>  > impose correlation structures or variance functions, in addition to
>  > the correlations generated by the random effects, on the conditional
>  > distribution of the response given the random effects (i.e. make lmer
>  > more like lme in the range of models that can be fit) may want to bear
>  > this case of getting a spurious fit to a model that is not well
>  > defined in mind.  Surprisingly the majority of the effort in allowing
>  > specifications like that is not in designing and implementing the data
>  > structures and numerical algorithms.  That part is comparatively easy.
>  >  The difficult part is deciding how the user should specify the model
>  > and how the specification is translated into the structures and
>  > algorithms.  You may think, "But that's easy - just use the lme
>  > specification" but that won't work.  The nlme package was created for
>  > S and S-PLUS then ported to R and it does not follow the set of steps
>  > for model-fitting functions that is native to R.  Most of the time
>  > that is not a problem but every once in a while it breaks and it
>  > breaks in ugly ways.  Think of the recent discussion on this list
>  > about getting fitted values and residuals that appropriately take
>  > account of the pattern of missingness in the original data.  How does
>  > the information on the pattern of missingness get into the model frame
>  > in the first place?  How are the "subset", "na.action", "offset", etc.
>  > arguments handled by model-fitting functions in R?
>  >
>  > These are subtle and difficult issues.  Furthermore, the definition of
>  > the models themselves must be done with care.  It is not possible to
>  > independently specify correlations, variance structures, conditional
>  > distributions (in the sense of models for binomial or Poisson
>  > responses) and arbitrary random effects structures.  Certain
>  > combinations don't make sense.  It is actually stronger than that,
>  > certain combinations are impossible.  Witness the recent suggestion
>  > that the lme4 package is somehow inferior because it doesn't allow
>  > specification of "the" repeated measures model for a binary response.
>  > Apparently "the" repeated measures model specifies a certain
>  > conditional variance-covariance structure in the responses and the
>  > fact that this structure is incompatible with the variances implied by
>  > the binomial response should not prevent specification and fitting of
>  > such a model.  Mathematical impossibility is apparently a rather
>  > flimsy excuse for not providing software that fits such a model.
>  >
>  > Later I'll write more on the issue of variance structures and
>  > correlation structures.  I plan to provide classes for extensions to
>  > the mixed models (in fact if you look at the AllClasses.R file on the
>  > archive you will see they are already there) and show how they could
>  > be used to fit certain simple extensions, like a model with stratified
>  > conditional variances.  Then a person who wants to create software to
>  > do a "one off" fit of a mixed model  including various "mix-in"
>  > specifications can do so.  However, I don't plan on offering general
>  > model specifications in lmer until I, or someone else, can work out
>  > end-to-end how to do it without painting yourself into a corner where
>  > impossible model specifications can be fit.  And also I want it to be
>  > done in an Rish way so that all the other model arguments work as they
>  > should.  (The main stumbling block there is that a call to model.frame
>  > allows for only 1 formula and you must be very careful how you create
>  > that formula.  If you create an omnibus formula from a collection of
>  > other formulas you have to decide what the environment of the omnibus
>  > formula should be.)
>
>
>
> ========================================
>  Martin Eklund
>  PhD Student
>  Department of Pharmaceutical Biosciences
>  Uppsala University, Sweden
>  Ph: +46-18-4714281
>  ========================================
>
>
>
>
>
>         [[alternative HTML version deleted]]
>
>  _______________________________________________
>  R-sig-mixed-models at r-project.org mailing list
>  https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>