[R-sig-ME] Variances of higher-order vs. lower-order random terms

Chris Howden chris at trickysolutions.com.au
Wed Oct 16 00:44:30 CEST 2013


Could it be that lower-order effects usually represent larger
differences between groups, while interactions tend to 'tweak' things?

For example, say we were predicting weight using gender and height.
Then height and gender would explain most of what's going on, but there
might be a very small gender*height interaction.
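
A quick simulation sketch of that intuition in R (all numbers are
invented purely for illustration):

set.seed(1)
n      <- 500
height <- rnorm(n, mean = 170, sd = 10)
male   <- rbinom(n, 1, 0.5)
## the main effects carry most of the signal; the interaction is a small tweak
weight <- -60 + 0.8 * height + 8 * male +
          0.05 * male * height + rnorm(n, sd = 5)
summary(lm(weight ~ height * male))  # interaction coefficient comes out tiny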

Chris Howden
Founding Partner
Tricky Solutions
Tricky Solutions 4 Tricky Problems
Evidence Based Strategic Development, IP Commercialisation and
Innovation, Data Analysis, Modelling and Training

(mobile) 0410 689 945
chris at trickysolutions.com.au


> On 15 Oct 2013, at 19:11, Jake Westfall <jake987722 at hotmail.com> wrote:
>
> Hi everyone,
>
> **TL;DR summary:**
> Is there any theoretical or empirical basis to support the following statement being true as a general rule of thumb?
> "When estimating a mixed model, typically the estimated variances/standard deviations of random effects associated with 'higher-order' terms (e.g., random effects of two-way, three-way, and beyond interaction terms) turn out to be *smaller* than the estimated variances/standard deviations of random effects associated with 'lower-order' terms (e.g., the residual variance, variances associated with simple effects of grouping factors)."
>
> The source of this claim is me. ;)
>
> ****
>
> Okay, now for the longer version...
>
> Typically when I sit down to start analyzing a new dataset which I know will call for a mixed model, one of the first models that I fit (after the statistical foreplay of looking through the observations in the dataset, plotting various things, cross-tabulating different factors, etc.) is one that is pretty close to the "maximal" random-effects specification, where every random effect that is in principle possible to estimate from the data is estimated.
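>
> (As a sketch, assuming a design with two manipulated factors and crossed
> subject and item grouping factors -- the data frame 'dat' and its column
> names are purely hypothetical -- such a near-maximal starting point might
> look like:)
>
> library(lme4)
> ## 'dat': hypothetical data with columns y, a, b, subject, item
> m_max <- lmer(y ~ a * b + (a * b | subject) + (a * b | item), data = dat)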
>
> Naturally, it is not uncommon that this nearly-maximal model will have some computational problems (convergence errors, wacky variance/covariance estimates, etc.) and that I have to trim back this model to find one that my data can more easily support. Fine.
>
> In these situations, the method I have come to prefer for trimming random terms is not to rely on significance tests or likelihood ratios, but rather to just identify the random effects that seem to have the smallest standard deviations (which can admittedly be a little tricky when predictors are on very different scales, but I try to take account of this in my appraisal) and remove these terms first, sequentially in an iterative process. The idea being that I want to alter the predictions of the model as little as possible while still reducing the complexity of the model.
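>
> (A sketch of that bookkeeping in lme4, continuing the hypothetical m_max
> above: extract the estimated random-effect SDs, with numeric predictors
> first put on comparable scales, e.g. via scale(), and sort:)
>
> vc  <- as.data.frame(VarCorr(m_max))                  # one row per (co)variance
> sds <- vc[is.na(vc$var2), c("grp", "var1", "sdcor")]  # keep SDs, drop correlations
> sds[order(sds$sdcor), ]  # smallest-SD terms are the first candidates to trim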
>
> One pattern that I seem to have noticed after a pretty good amount of time spent doing this is that following this method very often leads me to trim the random effects associated with higher-order terms (as defined above) first. This is not always true, and occasionally some of the higher-order terms explain a lot of variance, but that doesn't seem to be the general pattern. In sharp contrast, I usually find that lower-order random terms -- particularly those associated with simple effects of the grouping factors -- explain a pretty good amount of variance and are fairly essential to the model. At the extreme, the residual term is commonly among the largest variance components, although of course removing that term wouldn't be sensible.
>
> This entirely informal observation leads me to form the hypothesis that I stated at the beginning of this email.
>
> If it is true, then it constitutes a useful piece of advice that might be passed down to people who are less experienced with this kind of model selection process. But before I begin doing so, I want to check with other, more experienced users about their reactions to this observation. Does it seem more or less true to you? Is it roughly consistent with your experience fitting many different mixed models to many different datasets? Do you know of any sensible, theoretical reasons why we might actually *expect* this to be true in a lot of cases? Or does it just seem like bullshit?
>
> One possible answer here is that it is not true even in my own case, and I have simply deceived myself. Certainly a possibility that I am open to.
>
> Another possibility is that it might be true in my own case, but that this could simply be a kind of coincidence having to do with the kinds of datasets that I tend to work with routinely (which, FYI, are datasets in psychological / social sciences, a slight majority being experimental in origin, but also a fair proportion of non-experimental stuff). If this is the case then there is probably no good reason for expecting my observations to hold in general in other fields that handle very different kinds of data. Still, if there is a coherent non-coincidental reason for why this might be expected to be true, even if only for these particular kinds of datasets, I would love to hear it.
>
> And of course another possibility is that others *have* noticed similar patterns in their own data, and that it represents some kind of general rule of thumb that people find useful to keep in mind as they fit mixed models to various different data. If this is the case then it seems like there must be some compelling statistical-theoretical reason for why this pattern arises. But I really don't know what that reason would look like.
>
> I welcome anyone's thoughts and opinions about this. Totally legitimate responses to this might be as simple as comments like "Yeah I have noticed something similar in the data I've worked with, but I have no idea why it should be true" or conversely "I have noticed nothing like this in the data I've worked with." Of course I also welcome longer and more involved discussions...
>
> FULL DISCLOSURE: I think I am probably also going to post this question to stats.stackexchange.com. If/when I do, I will send along the link to that question thread.
>
> Jake
>
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models


