[R-sig-ME] How to assess significance of variance components (Please discard previous e mail... but read this one)
Greg Snow
538280 at gmail.com
Thu Oct 18 19:02:01 CEST 2012
First you should ask whether you really need to test the variance
component. Sometimes we get into the rut of feeling that we need to
test everything, but if knowledge of the science suggests that you
should include a random effect, then that is probably more important
than a test, and the effect should be left in. Even if the variance
component is small, it does not hurt to leave it in.
If the test on the variance component is still of interest then there
are a couple of options:
The ideal, theoretical option using Bayesian statistics is to put a
prior on the variance component with a point mass at 0; the posterior
then also has a point mass at 0, and you can look at how much of the
posterior probability sits there to decide (credible intervals may
then include or exclude 0 accordingly). However, this is difficult on
a practical level, since as far as I know none of the tools that make
Bayesian statistics easy can handle priors that include a point mass.
You can do it by hand for simple problems, but probably not for a
mixed model.
A similar alternative is to choose a value greater than 0 that would
still be considered "practically" 0, i.e. any value of the variance
component below that bound is treated as 0. You can then construct a
prior distribution that puts a fair amount of probability in the range
from 0 to your bound (probably a mixture of 2 distributions); if the
posterior credible interval lies entirely below the bound, you can
treat the component as having a value of 0. This type of prior can be
coded into standard Bayesian tools.
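As a sketch of the "practically zero" decision rule, assuming you
already have posterior draws of the variance component (in practice
they would come from your Bayesian fit; the draws and the bound here
are made up purely for illustration):

```r
# Sketch of the "practically zero" decision rule (illustrative values).
# post_va stands in for posterior draws of the variance component;
# here they are simulated, but in practice they would come from your fit.
set.seed(1)
post_va <- rgamma(5000, shape = 0.5, rate = 20)  # fake posterior draws

eps <- 0.01                    # your chosen "practically zero" bound
p_zero <- mean(post_va < eps)  # posterior probability the component is "practically 0"
ci <- quantile(post_va, c(0.025, 0.975))

# Treat the component as 0 if the whole credible interval sits below the bound:
treat_as_zero <- unname(ci[2]) < eps
```

The choice of eps is yours and should come from what counts as a
negligible variance on the scale of your response.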
A frequentist option would be to use a permutation test. Under the
null hypothesis that the variance component is 0, the grouping for
that random effect is arbitrary rather than meaningful, so you can
permute the group memberships, refit the model a bunch of times, and
look at the resulting null distribution of the variance component
estimate. Then compare the original estimate to this distribution.
Exactly how to do the permutations will depend on the structure of the
data and the model being fit. I think that I would prefer this method.
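The permutation idea can be sketched in base R. This is a toy version:
it assumes a balanced one-way layout and uses the method-of-moments
ANOVA estimator of the between-group variance component rather than a
full mixed-model fit, just to keep it self-contained; with real data
you would refit your actual model (e.g. with lmer) at each
permutation.

```r
# Toy permutation test for a variance component (balanced one-way layout).
set.seed(42)
n <- 5; k <- 10                      # 5 observations in each of 10 groups
g <- rep(1:k, each = n)
y <- rnorm(n * k) + rep(rnorm(k, sd = 1), each = n)

# Method-of-moments estimate of the between-group variance:
# (MSB - MSW) / n, truncated at 0.
vc_hat <- function(y, g) {
  ms <- anova(lm(y ~ factor(g)))[["Mean Sq"]]
  max((ms[1] - ms[2]) / n, 0)
}

obs <- vc_hat(y, g)

# Under H0 the grouping is arbitrary, so permute the labels and re-estimate:
perm <- replicate(999, vc_hat(y, sample(g)))
p_value <- (sum(perm >= obs) + 1) / (999 + 1)
```

As noted above, how you permute matters: if the design has blocking or
repeated measures, the labels should only be shuffled in ways that
respect the rest of the structure.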
It might be fun to try and find a way to combine the Bayesian with the
permutation test, though there are probably people in both camps that
would call it heresy.
On Thu, Oct 18, 2012 at 3:52 AM, <chantepie at mnhn.fr> wrote:
> Dear all,
>
> My question could appear trivial but I still have not found a clear answer.
>
> From what I gathered, the Bayesian framework gives us two possible tools to
> assess the significance of variances: the DIC and confidence interval (CI)
> estimates.
>
> DIC allows one to compare models and test for the significance of variances.
> Some papers mention that this approach is valid for all exponential-family
> distribution models, to the extent that it should even allow testing which
> distribution fits the data better. However, in a previous post Jarrod
> mentioned that DIC does not always answer the hypothesis we want to test, and
> finished by saying that for non-Gaussian distributions he'd never use DIC.
> And indeed, when running some animal models with a Poisson distribution I
> encountered strange results suggesting that DIC does not work at all: the
> lower CI of the Va estimates is clearly greater than 0, which leads me to
> think that Va is different from 0, but DIC does not give substantial support
> for models with Va.
> I understand that assessing confidence intervals is the great advantage of
> Bayesian models. But as variances are constrained to be non-negative, it is
> not possible to construct a P-value by counting the proportion of estimates
> below 0 (as we could do with covariances). When the lower CI of a variance is
> far from 0, it is quite easy to be sure that this variance is different from
> 0, but when the posterior modes are small and the lower CIs are close to 0,
> how can we decide? One approach could be to check whether the posterior mode
> is well defined or whether it "collapses" on zero, but would that be enough?
>
>
> I am aware that Bayesian statistics are not frequentist statistics and that
> the statistical tools differ, but a clear decision rule would be helpful.
>
> It would be most helpful to know your thoughts about this and whether there
> are other decision rules that could be applied.
>
> Thanks to all
>
> Stephane
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
--
Gregory (Greg) L. Snow Ph.D.
538280 at gmail.com