[R-sig-ME] Random effect variance = zero

John Maindonald john.maindonald at anu.edu.au
Fri Aug 15 09:16:00 CEST 2014


I have argued previously for allowing negative “variance” components.
There are cases where the ability to fit a negative variance component will lead
to something like the correct variance-covariance matrix.  If that is what the data
say, then the software should allow them to make their voice heard, and they 
should be listened to.

Scientists who misunderstand what blocking is about will on occasion choose 
blocks that maximise within-block variability, e.g., blocks laid out at right
angles to a stream rather than parallel to the stream.  

One needs to distinguish the case where there is no reason to choose any other
estimate than zero, and the case where, with the model as formulated, a negative
component of variance (e.g., a negative between block variance) is somewhere 
needed in order to get the variance-covariance matrix right.
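
To make the point concrete, here is a minimal sketch in R on invented data,
built to have negative within-block correlation.  A random block intercept can
only produce a non-negative within-block correlation; the marginal fit below,
using gls() from nlme, allows it to be negative:

  library(nlme)

  ## Invented data: partial centering within blocks induces a negative
  ## within-block correlation (about -0.23 here):
  set.seed(1)
  nblock <- 20; n <- 4
  e <- matrix(rnorm(nblock * n), nblock, n)
  y <- as.vector(t(e - 0.5 * rowMeans(e)))
  d <- data.frame(y = y, block = factor(rep(1:nblock, each = n)))

  ## A random block intercept can only give within-block correlation >= 0;
  ## the marginal model below lets it be negative:
  fit <- gls(y ~ 1, data = d, correlation = corCompSymm(form = ~ 1 | block))
  summary(fit)   # the estimated Rho should come out negative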

I am talking about what happens on the way to a final meaningful (one hopes)
modelling of the data.  At the end of the day, one would prefer not to have to give
a meaning to negative variances!  

John Maindonald             email: john.maindonald at anu.edu.au
phone: +61 2 (6125) 3473    fax: +61 2 (6125) 5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.


On 15 Aug 2014, at 1:12, Douglas Bates <bates at stat.wisc.edu> wrote:

> On Thu, Aug 14, 2014 at 5:54 AM, Marco Plebani <marcoplebani85 at gmail.com> wrote:
> 
>> Dear list members,
>> 
>> Package blme has been suggested for fixing issues with random effect
>> variance = zero on other occasions, but I do not understand the rationale
>> behind it.  What does blme do that lme4 does not?  In which way do the two
>> approaches differ?  In particular:
>> 
> 
> I appreciate your asking these questions.  Personally, I don't regard
> random-effect variances being estimated as zero as an issue that needs to
> be "fixed".  To me this is a situation where the data indicate that the
> model should be simplified, in the sense that the random effects term can
> be removed without substantial loss of fidelity to the data.
> 
> The maximum likelihood or the REML criterion can be viewed as a trade-off
> between model simplicity and fidelity to the data.  In cases where the
> model converges to boundary conditions, such as a variance component
> estimate of zero, the simpler model does not fit significantly worse than
> the model with the random effects present.
> 
> From the context of the data we may "know" (or at least expect) that there
> is variability between the groups that we should take into account.  An
> estimate of zero does not indicate that there is no variability in the
> groups; it indicates that there is no excess variability beyond what would
> be induced by the residual, or per-observation, variability.  If you
> simulate 50 observations, say from a Gaussian distribution, and arbitrarily
> divide them into 10 groups of 5, there will be variability in the means of
> these 10 groups, even though there was no explicit group-to-group
> variability added to the simulation.
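> 
> A minimal sketch of that simulation in R (object names invented):
> 
>   library(lme4)
>   set.seed(42)
>   d <- data.frame(y = rnorm(50),                    # no group effect built in
>                   g = factor(rep(1:10, each = 5)))  # arbitrary grouping
>   with(d, tapply(y, g, mean))    # the 10 group means still vary
>   fit1 <- lmer(y ~ 1 + (1 | g), data = d, REML = FALSE)
>   VarCorr(fit1)   # for many seeds the group variance is estimated as exactly 0
>   ## At the boundary the simpler model fits exactly as well; the LRT
>   ## statistic against plain lm() is then (essentially) zero:
>   fit0 <- lm(y ~ 1, data = d)
>   2 * (c(logLik(fit1)) - c(logLik(fit0)))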
> 
> An estimate of zero for a variance component is a property of the model,
> not of the underlying mechanism generating the data.  Remember George Box's
> famous statement that "All models are wrong; some models are useful."
> 
> This is why I feel uncomfortable with assigning a prior that will have the
> effect of pulling the estimate of a variance component away from zero. To
> me, this is overruling the data.  If the data do not contain sufficient
> information to distinguish between the model fits with and without the
> random effects, then to me that indicates that you should report it as such.
> This doesn't mean that you have affirmed the null hypothesis of the
> between-group variance being zero.  It is much more likely that there is
> insufficient data to estimate the parameters in a model of this level of
> complexity.  Don't confuse absence of evidence with evidence of absence.
> 
> Estimation of variance and covariance components requires a large number of
> groups.  It is important to realize this.  It is also important to realize
> that in most cases you are not terribly interested in precise estimates of
> variance components.  Sometimes you are, but a substantial portion of the
> time you are using random effects to model subject-to-subject variability,
> etc., and if the data don't provide sufficient subject-to-subject
> variability to support the model, then drop down to a simpler model.  This
> works in the case of a zero variance component; other cases with variances
> and covariances in which the covariance matrix has a singular estimate are
> more difficult to handle.
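> 
> One way to check for such boundary fits (a sketch; "fit1" is the model
> from the snippet above):
> 
>   theta <- getME(fit1, "theta")   # relative covariance parameters
>   lower <- getME(fit1, "lower")   # their lower bounds (0 for sd-like terms)
>   any(theta[lower == 0] < 1e-4)   # TRUE => at or near the boundary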
> 
> In a Bayesian framework the choice of prior can allow you to pull the
> parameter estimates away from uncomfortable values.  But it is this choice
> that I find uncomfortable.  Suppose I analyze data using one prior and
> reach some conclusions and then you analyze the same data with a different
> choice of prior and reach other conclusions.  Are our conclusions based on
> the data or on our prior beliefs?
> 
> In most cases this conflict doesn't happen.  If the likelihood is much less diffuse
> than the prior then the posterior distribution is dominated by the data,
> not the prior. But it is exactly in the boundary cases that the likelihood
> is very diffuse and the information is coming from the prior, not the data.
> To me, this is a red flag.  Assigning a prior to pull back parameter
> estimates from problematic values is, in my opinion, overruling the data.
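> 
> For concreteness: blme's blmer() keeps the lmer() interface but places a
> prior on the random-effects covariance (a Wishart by default, as I
> understand it).  A sketch, reusing the simulated data above:
> 
>   library(blme)
>   fit.b <- blmer(y ~ 1 + (1 | g), data = d)
>   VarCorr(fit.b)   # typically a small positive value where
>                    # lmer() sat exactly at zero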
> 
> I feel that the choice of prior should be justified on grounds other than
> "it gives me the results that I want".  That is too harsh a criticism - no
> reputable investigator would do such a thing on purpose but they may do so
> by accident.  As with many concepts in statistics, the mathematics to
> investigate the properties of priors is subtle and difficult.  Box and Tiao
> in their book "Bayesian Inference in Statistical Analysis" appeal to the
> concept of "data translated likelihood" to justify a locally uniform prior
> on the logarithm of a variance.  This means that the prior pushes the
> estimate of a standard deviation or variance towards zero, not away from
> zero.
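> 
> In symbols: a locally uniform prior on log(sigma) corresponds, by the
> usual change of variables, to p(sigma) proportional to 1/sigma, which
> places more prior mass near sigma = 0, not less.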
> 
> I do admit that I haven't kept up with the literature on Bayesian inference
> so there may be better justifications for prior distributions on variance
> components and covariance matrices for random effects.  I do think,
> however, that there should be some justification outside the context of the
> data for a choice of prior, especially in cases where the prior dominates
> the likelihood.  In practice this means the cases where the estimates are
> on the boundary or where the information on the variance components is very
> diffuse.  Unfortunately, those cases are more common than we would like.
> You must have a large number of groups before you can hope to have
> precision on the estimate of a single variance component.  You must have a
> very large number of groups before you can hope for precision of an
> estimate of a covariance matrix for random effects.
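> 
> Profiling shows how little precision there is (a sketch, again with the
> model fitted above; ".sig01" is lme4's label for the among-group
> standard deviation):
> 
>   confint(fit1, method = "profile")
>   ## with only 10 groups the interval for .sig01 is very wide, and for
>   ## a boundary fit its lower limit is 0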
> 
>> - what is the prior information that blme is using, and
>> - how come blme still estimates parameter values and assigns p-values
>> to them?  According to my (very limited) knowledge of Bayesian stats the
>> outcome of the analysis should be an updated distribution of the possible
>> parameter values.
>> 
>> The available documentation about blme is limited, and/or I could not find
>> it.  I realize that my question on blme hides another, much broader one, on
>> how Bayesian stats work; regarding the latter, a suggestion of a good,
>> practice-oriented reference book would be appreciated.
>> 
>> Thank you in advance,
>> 
>> Marco
>> 
>> -----
>> Marco Plebani, PhD candidate (Ecology) at the University of Zurich
>> Institute of Evolutionary Biology and Environmental Studies
>> http://www.ieu.uzh.ch/staff/phd/plebani.html
>> 
>> On 13/ago/2014, at 12:00, r-sig-mixed-models-request at r-project.org wrote:
>> 
>>> Date: Tue, 12 Aug 2014 12:35:10 -0400
>>> Subject: Re: [R-sig-ME] Random effect variance = zero
>>> From: bbolker at gmail.com
>>> To: aurorepaligot at hotmail.com
>>> CC: r-sig-mixed-models at r-project.org
>>> 
>>> 
>>> Short answer: yes, very common outcome, especially with small numbers of
>>> random effects groups (e.g. <5).  See http://glmm.wikidot.com/faq ; blme
>>> package for 'regularizing' fits so this doesn't happen (at the expense of
>>> changing the statistical model slightly); http://rpubs.com/bbolker/4187 .
>>> 
>>> 
>>> 
>>> On Tue, Aug 12, 2014 at 12:05 PM, Aurore Paligot <aurorepaligot at hotmail.com> wrote:
>>> 
>>> Hello Everybody, I am new to using mixed models, and I would like some
>>> advice about some results that I obtained and that seem counter-intuitive
>>> to me.  As an output of a test, I obtained a variance of zero for a random
>>> factor.
>>> 
>>> […] How is it possible?  Can it be considered as a reasonable output?
>>> 
>>> I found this information about variance estimates of zero.  Could this
>>> explanation apply to my study?
>>> 
>>> "It is possible to end up with a school variance estimate of zero. This
>>> fact often puzzles the researcher since each school will most certainly not
>>> have the same mean test result. An estimated among-school variance being
>>> zero, however, does not mean that each school has the same mean, but rather
>>> that the clustering of the students within schools does not help explain
>>> any of the overall variability present in test results. In this case, test
>>> results of students can be considered as all independent of each other
>>> regardless if they are from the same school or not."
>>> ( http://www.cscu.cornell.edu/news/statnews/stnews69.pdf )
>>> 
>>> If not, where could the problem come from? Is the formula that I used
>>> correct? Is a mixed-model appropriate for this type of question?
>>> 
>>> I would really appreciate some clarification if someone has already faced
>>> this type of problem!
>>> 
>>> Best regards,
>>> 
>>> Aurore
>> 


