[R-sig-ME] variance components models with zero estimates

Sun Jun 8 18:33:31 CEST 2008

On 6/7/08, David Airey <david.airey at vanderbilt.edu> wrote:
> When a variance components mixed model is run in Stata, if some of the
> variance components are zero, the model may not converge, for rational
> reasons according to the manual entry. However, when the same model is run
> in SAS, the models with variance components that estimate to zero
> nonetheless converge. According to some SAS user friends, this is normal SAS
> behavior (I'm new to SAS as of yesterday). If I'm interested in looping
> through a set of such models, the SAS behavior is preferred. However, in
> Stata such models can be formulated as multilevel models that can
> dramatically reduce the dimension of the design matrix. The context where
> both behaviors is important is mixed models for gene set enrichment
> analysis, where there is a possibility of hundreds of models.

That's a very interesting topic and something that I am tempted to
answer in far too much detail.

Let me provide the brief answer.  Currently there are two versions of
the lme4 package available: a released version and a development
version.  To install the development version use

install.packages("lme4", repos = "http://r-forge.r-project.org")

The major difference between these versions is how they handle
evaluation of the log-likelihood when variance components are zero.
That is, does the log-likelihood get evaluated in a way that behaves
smoothly as a variance component goes to zero (and in the more general
situation of a variance-covariance matrix approaching singularity)?
The released version doesn't behave smoothly at zero and it is
necessary to restrict variances to being slightly greater than zero.
The development version does go to zero smoothly.

Gregor Gorjanc's response relates to the released version.  Neither
version throws an error condition when the estimates are zero or
near-zero.  However, if you are concerned about estimates that may be
zero I recommend using the development version of the lme4 package.
It is more robust and reliable at determining the parameter estimates.

(The natural question to ask is then why not release the development
version?  I have not done so because there are some support routines,
notably the Markov chain Monte Carlo sampler, whose versions in the
new formulation are incomplete, and some software supporting books
depends on those.)

>  Does R lme4 handle variance components mixed models that have estimates of
> zero for some of the variance components like SAS or Stata?

Neither.  The lmer function handles these situations correctly :-)

> Is it possible
> to loop through variance components models when some of the variance
> components are zero?

Yes.  Do bear in mind that much of the effort in fitting such models
is spent establishing the structures representing the model and data -
often more time will be spent formulating the numerical representation
of the model than is spent in optimizing the parameter estimates.  If
you find that your loops are taking a very long time to run you may
want to dig deeper into the code and see if you can by-pass some of
the steps and go directly to the optimization phase.  There is now a
function called "refit" in lme4 that can be used to go directly to the
optimization phase if the model and design are fixed and only response
has been changed.

> What is a suggested procedure for doing so in R? In
> Stata, I would probably use an EM only guess at which variance components
> were substantive, and then fit one of several models. What about R lme?

i think we would need more detail to be able to answer such a question.