[R-sig-ME] Question concerning the reduction of random effects

Poe, John jdpo223 at g.uky.edu
Sat Feb 4 19:23:57 CET 2017


Wow, that's a lot of random effects parameters to deal with.

You could use likelihood ratio tests to decide which nouns aren't having an
impact on model fit, but with cross-classified levels the order in which you
do the tests can matter if the nouns are correlated.

There are three basic ways to go about it when you have a lot of parameters.

1) You can add a single random effect, use an LR test against a model
without any other random intercepts, and cycle your way through all of the
groups (approach 1 in the sketch after this list). This is a problem if
groups are correlated, because two random effects can be essentially
collinear, so the test can be too generous when cross-classified random
effects are tested one at a time. Still, it gives you a decent way to thin
your initial list, because it is the most conservative of these three
approaches for testing impact on model fit. A random effect shouldn't be
rejected by this version of the test and then show an impact in the next
version unless something is seriously wrong with your data, model, or
estimator.

2) You can use a leave-one-out approach: fit every random intercept except
one, compare that fit to the full model with all random intercepts using an
LR test, and cycle through the list like a jackknife (approach 2 in the
sketch below). My guess is you'll have problems with this because of the
number of parameters you have. Try version 1 first to toss out random
effects that don't matter, then use this approach to test the remainder. If
you still have too many dimensions in the model, you may be able to reduce
them with some version of factor analysis: if nouns are correlated, combine
them via their factor loadings and estimate random effects for those
aggregations.

3) You can cycle through all possible combinations of random effects and
use Bayesian model averaging to work out in what percentage of the
candidate models a noun matters according to some model fit statistic
(approach 3 in the sketch below). You can also combine this version with
version 1 or with factor analysis to do an initial reduction in parameters
and make estimation more manageable.
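
To make those concrete, here are minimal lme4 sketches of all three
approaches. The names are placeholders, not anything from your model:
assume a binary outcome y, a fixed effect x, candidate grouping factors
g1, g2, g3, and a data frame dat.

    library(lme4)

    ## Approach 1: screen each random intercept on its own against a plain
    ## GLM. The df = 1 chi-square p-value is conservative because the
    ## variance sits on the boundary of the parameter space under the null.
    base <- glm(y ~ x, family = binomial, data = dat)
    for (g in c("g1", "g2", "g3")) {
        m1  <- glmer(reformulate(c("x", sprintf("(1 | %s)", g)), response = "y"),
                     family = binomial, data = dat)
        lrt <- as.numeric(2 * (logLik(m1) - logLik(base)))
        cat(g, "LR =", round(lrt, 2),
            "p <=", signif(pchisq(lrt, df = 1, lower.tail = FALSE), 3), "\n")
    }

    ## Approach 2: leave one random intercept out of the full model at a
    ## time and LR-test the reduced fit against the full fit.
    full <- glmer(y ~ x + (1 | g1) + (1 | g2) + (1 | g3),
                  family = binomial, data = dat)
    for (g in c("g1", "g2", "g3")) {
        kept    <- setdiff(c("g1", "g2", "g3"), g)
        reduced <- glmer(reformulate(c("x", sprintf("(1 | %s)", kept)),
                                     response = "y"),
                         family = binomial, data = dat)
        print(anova(reduced, full))  # contribution of g on top of the others
    }

    ## Approach 3, as a rough stand-in for full Bayesian model averaging:
    ## fit every subset of random intercepts, turn AICs into Akaike weights,
    ## and sum the weights of the models that include each grouping factor.
    cands <- c("g1", "g2", "g3")
    subs  <- c(list(character(0)),
               unlist(lapply(seq_along(cands),
                             function(k) combn(cands, k, simplify = FALSE)),
                      recursive = FALSE))
    fits  <- lapply(subs, function(s) {
        if (length(s) == 0) return(base)
        glmer(reformulate(c("x", sprintf("(1 | %s)", s)), response = "y"),
              family = binomial, data = dat)
    })
    aics <- sapply(fits, AIC)
    w    <- exp(-0.5 * (aics - min(aics)))
    w    <- w / sum(w)
    sapply(cands, function(g) sum(w[sapply(subs, function(s) g %in% s)]))

The Akaike weights in the last block are only a cheap approximation to real
model averaging; with an MCMC fit you could average over posterior model
probabilities instead.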

As a general caution, you should be very certain that your random effects
approximations are accurate in this case. I'm guessing you're using some
variant of MCMC, because Gauss-Hermite quadrature would fail miserably if
it had to estimate that many random effects. If you're trying to use PQL or
a Laplace approximation here, you are very likely to have problems with
accuracy, which will bias the LR tests toward finding no effect. This is
more likely to happen if your nouns are correlated, because the levels
omitted in version 1 will cause departures from normality in the shape of
the random effects you leave in the model.
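
One cheap accuracy check, at least for a model cut down to a single random
intercept (lme4 only allows nAGQ > 1 in that case; same placeholder names
as above): refit with more and more quadrature points and watch whether the
log-likelihood and the variance estimate stabilize.

    ## Laplace is nAGQ = 1; larger values use adaptive Gauss-Hermite
    ## quadrature with that many nodes.
    for (q in c(1, 5, 11, 25)) {
        m <- glmer(y ~ x + (1 | g1), family = binomial, data = dat, nAGQ = q)
        cat("nAGQ =", q,
            " logLik =", round(as.numeric(logLik(m)), 3),
            " sd(g1) =", round(as.data.frame(VarCorr(m))$sdcor[1], 3), "\n")
    }

If those numbers keep drifting as the number of nodes grows, the Laplace
fit, and any LR test built on it, shouldn't be trusted.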

Hope that makes sense.  I'm typing it out on my phone.

On Feb 4, 2017 12:30 PM, "Tibor Kiss" <tibor at linguistics.rub.de> wrote:

> Hi,
>
> perhaps I should add for greater clarity that I have calculated the values
> below for each level of the random effect "noun". Altogether, the random
> effect has a standard deviation of 3.85 on an intercept of 6.95.
>
> I have assumed that a level of the random effect does not have an effect
> if the interval given by its conditional mode +/- 1.96 * sqrt(conditional
> variance) includes 0 (because then, in a binary model, it can have a
> positive as well as a negative effect on the intercept). I am only
> interested in those levels that show an unambiguously negative influence.
> Thus the level with the largest negative value (-7.93) may reduce the
> probability of a positive outcome from 99.9 % to a mere 27.3 %.
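> (As a quick sanity check of that arithmetic in R, using the inverse logit:
>     plogis(6.95)          # ~0.999, probability at the intercept alone
>     plogis(6.95 - 7.93)   # ~0.273, probability for the most negative level
> )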
>
> With kind regards
>
> Tibor
>
>
>
>
>
> On 04.02.2017 at 17:22, Tibor Kiss <tibor at linguistics.rub.de> wrote:
>
> > Hi John,
> >
> > No, I have extracted the conditional variances with ranef(MODEL, condVar
> > = T) as well as attr(MODEL.randoms[[1]], "postVar"), determined
> > sqrt(variance) * 1.96, and checked whether abs(intercept) -
> > sqrt(variance) * 1.96 > 0.
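> > (Spelled out, that check amounts to something like the following sketch,
> > where MODEL is the fitted lme4 object:
> >     re   <- ranef(MODEL, condVar = TRUE)
> >     est  <- re[[1]][, 1]              # conditional modes per level
> >     pv   <- attr(re[[1]], "postVar")  # 1 x 1 x n array of cond. variances
> >     half <- 1.96 * sqrt(pv[1, 1, ])   # half-width of the ~95% interval
> >     keep <- abs(est) - half > 0       # levels whose interval excludes 0
> > )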
> >
> > I treat nouns only as random effects, since they are sampled from an
> > infinite population.
> >
> > With kind regards
> >
> > Tibor
> >
> >
> >
> >
> >
> > On 04.02.2017 at 17:10, Poe, John <jdpo223 at g.uky.edu> wrote:
> >
> >> Tibor,
> >>
> >> If I understand you correctly, you've included each group as a fixed
> >> effect to get the confidence intervals and run an enormous number of
> >> hypothesis tests. If that's the case, you really can't trust the
> >> results. That many categorical fixed effects for a nonlinear outcome
> >> will produce biased coefficients and standard errors; even with as few
> >> as ten groups you start to see bias. So just because a group's fixed
> >> effect has zero in its CI doesn't mean that the group does not, in
> >> reality, have a significant mean difference from the population
> >> average.
> >>
> >> On Feb 4, 2017 10:38 AM, "Tibor Kiss" <tibor at linguistics.rub.de> wrote:
> >> Hello everybody,
> >>
> >> I have a question that may sound weird, but I haven't found anything
> >> on this yet, so maybe someone can enlighten me here.
> >>
> >> I am working with a BGLMM (random intercept) that contains the random
> >> effect "noun" (basically, a noun occurring in a natural language
> >> sentence from a sample) with 1,302 levels. The number of levels is this
> >> high because the sample contains that many different nouns in the
> >> relevant position of the clause.
> >>
> >> After determining the variances of the individual 1,302 nouns, only 54
> >> nouns remain whose 95 % confidence intervals do not contain 0. So these
> >> are the nouns that are actually interesting. I now have a hunch that
> >> this reduced set of nouns also influences some of the slopes, but I
> >> cannot test this: the dataset contains only 2,284 observations, so the
> >> total number of random effects would be larger than the number of
> >> observations if a binary feature were tested.
> >>
> >> I would like to find a way to reduce the random effects to the ones
> >> that have proven relevant in the random intercept model, and to use the
> >> reduced set for a random slope model. It strikes me that this is
> >> tampering with the data unless there is a principled way of selecting a
> >> subset from the set of random effects. So, if there is a principled
> >> way, I would appreciate learning about it.
> >>
> >>
> >>
> >> With kind regards
> >>
> >> Tibor
> >>
> >>
> >>


