[R-sig-ME] R-sig-mixed-models Digest, Vol 10, Issue 16

Iasonas Lamprianou lamprianou at yahoo.com
Fri Oct 19 12:03:59 CEST 2007


Dear friends,
I had sent the following email:
Dear friends,
I am using lmer to run a binomial multilevel model. However, it takes approximately 8 hours to run, so I downloaded MLWin and tried it: the same model converges in only 3 to 4 minutes. Am I doing something wrong with lmer? This is the model I am trying to fit:

m1 <- lmer(score ~ 1 + item + (1 | final_id), data = malt, family = binomial,
           method = "PQL", control = list(gradient = FALSE, msVerbose = TRUE))

I also attach the dataset in case somebody wants to replicate my model.


However, I received no responses. Does anyone have an answer to my question?

jason
 
Dr. Iasonas Lamprianou
Department of Education
The University of Manchester
Oxford Road, Manchester M13 9PL, UK
Tel. 0044 161 275 3485
iasonas.lamprianou at manchester.ac.uk


----- Original Message ----
From: "r-sig-mixed-models-request at r-project.org" <r-sig-mixed-models-request at r-project.org>
To: r-sig-mixed-models at r-project.org
Sent: Friday, 19 October, 2007 11:44:28 AM
Subject: R-sig-mixed-models Digest, Vol 10, Issue 16

Send R-sig-mixed-models mailing list submissions to
    r-sig-mixed-models at r-project.org

To subscribe or unsubscribe via the World Wide Web, visit
    https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
or, via email, send a message with subject or body 'help' to
    r-sig-mixed-models-request at r-project.org

You can reach the person managing the list at
    r-sig-mixed-models-owner at r-project.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of R-sig-mixed-models digest..."


Today's Topics:

  1. [R] coef se in lme (dave fournier)
  2. Re: [R] coef se in lme (Douglas Bates)
  3. [R] coef se in lme (dave fournier)
  4. Re: [R] coef se in lme (David Duffy)
  5. Re: change in variance components depending on scaling of
     fixed effects (Martin Maechler)
  6. lmer() producing wildly inaccurate values for some GLM models
     (Nathaniel Smith)


----------------------------------------------------------------------

Message: 1
Date: Thu, 18 Oct 2007 22:32:30 -0700
From: dave fournier <otter at otter-rsch.com>
Subject: [R-sig-ME]  [R] coef se in lme
To: r-sig-mixed-models at r-project.org
Message-ID: <4718416E.6010002 at otter-rsch.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

The AD Model Builder method I posted earlier takes into account the
uncertainty in the mean and both std deviations in Harold's simple model.

        Y_ij = mu + u_i + eps_ij

To illustrate this I built a little model and simulated a data set with
1<=i<=10, 1<=j<=5 observations.  Below are the parameter estimates
together with their estimated std devs.  The true values were mu=3.0,
sigma_u=2.0, and sigma=3.0.



>    1  mu          2.4091e+00 7.4307e-01
>    2  log_sigma_u  6.3908e-01 3.4913e-01
>    3  log_sigma    1.1354e+00 1.1180e-01
>    4  u          -3.1695e-01 6.4561e-01
>    5  u            2.0441e-01 6.4496e-01
>    6  u            5.1996e-01 6.4748e-01
>    7  u            4.3754e-01 6.4661e-01
>    8  u            4.8381e-01 6.4708e-01
>    9  u            1.2214e+00 6.6075e-01
>    10  u          -1.7631e+00 6.7793e-01
>    11  u            3.3987e-01 6.4578e-01
>    12  u          -1.7663e-01 6.4485e-01
>    13  u          -9.5030e-01 6.5439e-01

Then I fixed log_sigma_u and log_sigma at their
estimated values and obtained:
> # fixed sd's
>  index  name  value      std dev
>      1  mu  2.4093e+00 7.4307e-01
>      2  u  -3.1736e-01 6.4407e-01
>      3  u  2.0456e-01 6.4407e-01
>      4  u  5.2045e-01 6.4407e-01
>      5  u  4.3794e-01 6.4407e-01
>      6  u  4.8426e-01 6.4407e-01
>      7  u  1.2226e+00 6.4407e-01
>      8  u  -1.7650e+00 6.4407e-01
>      9  u  3.4016e-01 6.4407e-01
>    10  u  -1.7689e-01 6.4407e-01
>    11  u  -9.5138e-01 6.4407e-01
And finally I almost fixed mu
(I can't fix it completely, because then there would be
no "fixed" effects and the model thinks there is
nothing to do) and obtained:

> # all fixed
>  index  name  value      std dev
>      1  mu  2.4093e+00 2.2361e-03
>      2  u  -3.1736e-01 5.9145e-01
>      3  u  2.0456e-01 5.9145e-01
>      4  u  5.2045e-01 5.9145e-01
>      5  u  4.3794e-01 5.9145e-01
>      6  u  4.8426e-01 5.9145e-01
>      7  u  1.2226e+00 5.9145e-01
>      8  u  -1.7650e+00 5.9145e-01
>      9  u  3.4016e-01 5.9145e-01
>    10  u  -1.7689e-01 5.9145e-01
>    11  u  -9.5138e-01 5.9145e-01
So, for example, u(1), the first random effect, has estimated std devs

  6.4561e-01, 6.4407e-01, and 5.9145e-01


under the three models. It appears that most of the "extra" uncertainty 
in u(1) comes from the uncertainty in mu.
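
For readers who want to try the same comparison in R: here is a minimal
sketch (mine, not Fournier's ADMB code) that simulates data from the
model above and fits it with lmer.  The seed is arbitrary, so the
numbers will not match the output shown.

set.seed(1)                      # arbitrary; the original seed is unknown
ngroups <- 10; nobs <- 5
mu <- 3; sigma_u <- 2; sigma <- 3
g <- factor(rep(1:ngroups, each = nobs))
u <- rnorm(ngroups, 0, sigma_u)  # true random effects
y <- mu + u[g] + rnorm(ngroups * nobs, 0, sigma)
library(lme4)
fit <- lmer(y ~ 1 + (1 | g))     # Y_ij = mu + u_i + eps_ij
summary(fit)                     # estimates of mu, sigma_u, and sigma
ranef(fit)                       # conditional modes of the u_i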

-- 
David A. Fournier
P.O. Box 2040,
Sidney, B.C. V8l 3S3
Canada
Phone/FAX 250-655-3364
http://otter-rsch.com



------------------------------

Message: 2
Date: Thu, 18 Oct 2007 17:03:05 -0500
From: "Douglas Bates" <bates at stat.wisc.edu>
Subject: Re: [R-sig-ME] [R] coef se in lme
To: davef at otter-rsch.com
Cc: r-sig-mixed-models at r-project.org
Message-ID:
    <40e66e0b0710181503x6c3fec49s509d683c953b61e8 at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

On 10/18/07, dave fournier <otter at otter-rsch.com> wrote:

> Here is one approach to this problem.

> In the AD Model Builder Random Effects package we provide estimated
> standard deviations for any function of the fixed and random effects,
> (here I include the parameters which detemine the covarince matrices if
> present) and the random effects. This is for general nonlinear random
> effects models, but the calculations can be used for linear models as
> well. We calculate these estimates as follows. Let L(x,u)
> be the log-likelihood function for the parameters x and u given the
> observed data,
> where u is the vector of random effects and x is the vector of the other
> parameters.

I know it may sound pedantic but I don't know what a log-likelihood
L(x,u) would be because you are treating parameters and the random
effects as if they are the same type of object and they're not.  If
you want to use a Bayesian approach you can kind of level the playing
field and say that everything is a parameter except for the observed
data values.  However, Bayesians also need to come up with a prior and
that isn't trivial in this case, as I tried to indicate in my message
about the mcmcsamp chain taking excursions.

I find I can easily confuse myself in the theory of the maximum
likelihood or REML estimates if I am not careful about the terminology
and the roles of the different coefficients in the linear predictor.
I think I would call that function L(x,u) the conditional density of
the random effects given the data.  The parameters determine the joint
density for the random effects and responses so plugging the observed
values of the responses into this expression yields the conditional
density of the random effects.
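
In symbols (a hedged restatement of that distinction, writing x for the
parameters, u for the random effects, and y for the responses):

    f(y, u ; x) = f(y | u ; x) * f(u ; x)       joint density, set by x
    f(u | y ; x)  proportional to  f(y, u ; x)  in u: the "L(x,u)" above
    F(x) = log integral f(y, u ; x) du          marginal log-likelihood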

> Let F(x) be the log-likelihood for x after the u have been
> integrated out. This integration might be exact or more commonly via the
> Laplace approximation or something else.
> For any x let uhat(x) be the value of u which maximizes L(x,u),

I think that is what I would call the conditional modes of the random
effects.  These depend on the observed responses and the model
parameters.

> and let xhat be the value of x which maximizes F(x).

> The estimate for the covariance matrix for the x is then
> S_xx = inv(F_xx) and the estimated full covariance matrix Sigma for the
> x and u is given by

> S_xx                S_xx * uhat_x
> (S_xx * uhat_x)' uhat' * S_xx * uhat_x + inv(L_uu)

> where ' denotes transpose _x denotes first derivative wrt x (note that
> uhat is a function of x so that uhat_x makes sense) and _xx _uu denote
> the second derivatives wrt x and u. we then use Sigma and the delta
> method to estimate the standard deviation of any (differentiable)
> function of x and u.

I'm getting a little bit lost here.  In the example you sent based on
Harold's discussion, the dimension of x is 3 and the dimension of u is
10 so Sigma is a 13 by 13 matrix, right?  S_xx is 3 by 3 and L_uu is
10 by 10.  To form the product S_xx*uhat_x I think that uhat_x needs
to be 3 by 10.  Is that right?  (I'm used to writing the Jacobian of a
vector-valued function of a vector the other way around.)

It looks like you are missing a _x in the first term in "uhat' * S_xx * uhat_x".

To evaluate L_uu you need a value of x.  I assume you will use the
parameter estimates. Correct?

Will the parameterization of x affect the result?  If I write the
model in terms of the logarithms of the variances instead of the
variances I will definitely get a different Sigma but will the result
for a linear combination of mu and some of the u's stay invariant?  If
it isn't invariant, how do I choose the parameterization of the
variance components?

Can you give a bit more detail on how you justify mixing derivatives
of the marginal log-likelihood (F) with derivatives of the conditional
density (L)?  Do you know that these are on the same scale?  I'm
willing to believe that they are - it is just that I can't see right
off why they should be.



------------------------------

Message: 3
Date: Fri, 19 Oct 2007 03:10:27 -0700
From: dave fournier <otter at otter-rsch.com>
Subject: [R-sig-ME]  [R] coef se in lme
To: r-sig-mixed-models at r-project.org
Message-ID: <47188293.9070502 at otter-rsch.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Douglas Bates wrote:
> On 10/18/07, dave fournier <otter at otter-rsch.com> wrote:
>
>
>> Here is one approach to this problem.
>>
>
>
>> In the AD Model Builder Random Effects package we provide estimated
>> standard deviations for any function of the fixed and random effects,
>> (here I include the parameters which detemine the covarince matrices if
>> present) and the random effects. This is for general nonlinear random
>> effects models, but the calculations can be used for linear models as
>> well. We calculate these estimates as follows. Let L(x,u)
>> be the log-likelihood function for the parameters x and u given the
>> observed data,
>> where u is the vector of random effects and x is the vector of the other
>> parameters.
>>
>
> I know it may sound pedantic but I don't know what a log-likelihood
> L(x,u) would be because you are treating parameters and the random
> effects as if they are the same type of object and they're not.  If
> you want to use a Bayesian approach you can kind of level the playing
> field and say that everything is a parameter except for the observed
> data values.  However, Bayesians also need to come up with a prior and
> that isn't trivial in this case, as I tried to indicate in my message
> about the mcmcsamp chain taking excursions.
>
> I find I can easily confuse myself in the theory of the maximum
> likelihood or REML estimates if I am not careful about the terminology
> and the roles of the different coefficients in the linear predictor.
> I think I would call that function L(x,u) the conditional density of
> the random effects given the data.  The parameters determine the joint
> density for the random effects and responses so plugging the observed
> values of the responses into this expression yields the conditional
> density of the random effects
>
OK, we will call them that.

>> Let F(x) be the log-likelihood for x after the u have been
>> integrated out. This integration might be exact or more commonly via the
>> Laplace approximation or something else.
>> For any x let uhat(x) be the value of u which maximizes L(x,u),
>>
>
> I think that is what I would call the conditional modes of the random
> effects.  These depend on the observed responses and the model
> parameters.
>

  That's right.
>
>> and let xhat be the value of x which maximizes F(x).
>>
>
>
>> The estimate for the covariance matrix for the x is then
>> S_xx = inv(F_xx) and the estimated full covariance matrix Sigma for the
>> x and u is given by
>>
>
>
>> S_xx                S_xx * uhat_x
>> (S_xx * uhat_x)' uhat' * S_xx * uhat_x + inv(L_uu)
>>
>
>
>> where ' denotes transpose _x denotes first derivative wrt x (note that
>> uhat is a function of x so that uhat_x makes sense) and _xx _uu denote
>> the second derivatives wrt x and u. we then use Sigma and the delta
>> method to estimate the standard deviation of any (differentiable)
>> function of x and u.
>>
>
> I'm getting a little bit lost here.  In the example you sent based on
> Harold's discussion, the dimension of x is 3 and the dimension of u is
> 10 so Sigma is a 13 by 13 matrix,
that is right.

> right?  S_xx is 3 by 3 and L_uu is
> 10 by 10.  To form the product S_xx*uhat_x I think that uhat_x needs
> to be 3 by 10.  Is that right?  (I'm used to writing the Jacobian of a
> vector-valued function of a vector the other way around.)
>
Yes, it is either 3x10 or 10x3, depending on convention.  Since S_xx is
3 by 3, for S_xx * uhat_x to make sense uhat_x must be 3 by 10,
i.e. 3 rows and 10 columns.

> It looks like you are missing a _x in the first term in "uhat' * S_xx
> * uhat_x".
>
>
That is right; it should be uhat_x' * S_xx * uhat_x.
> To evaluate L_uu you need a value of x.  I assume you will use the
> parameter estimates. Correct?
>
>
Yes, the derivatives are evaluated at (xhat, uhat).

> Will the parameterization of x affect the result?
No.  Suppose that we have a reparameterization of the x's given by

  x = g(y)

  then wrt y the function is F(g(y)), so the Hessian wrt y is

                    g_y' * F_xx * g_y

This is due to the fact that F_x=0.  Taking the inverse gives

                inv(g_y) * inv(F_xx) * inv(g_y')

so that when you do the delta method the g_y matrices cancel out.
> If I write the
> model in terms of the logarithms of the variances instead of the
> variances I will definitely get a different Sigma
Yes, but if you then use the delta method to construct the covariance
matrix for the dependent variables exp(log_sigmas), you will recover
the original.  This is just the chain rule.
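
A quick numeric check of that invariance claim (my sketch, not from the
thread): fit a one-parameter normal scale model once in sigma and once
in log(sigma), and compare the direct standard error of sigma with the
delta-method one.

set.seed(42)
y <- rnorm(50, 0, 3)
nll.sigma <- function(s)  -sum(dnorm(y, 0, s, log = TRUE))
nll.logs  <- function(ls) -sum(dnorm(y, 0, exp(ls), log = TRUE))
f1 <- optim(1, nll.sigma, method = "Brent", lower = 0.1, upper = 10,
            hessian = TRUE)
f2 <- optim(0, nll.logs, method = "Brent", lower = log(0.1),
            upper = log(10), hessian = TRUE)
se.direct <- sqrt(1 / f1$hessian[1, 1])
## delta method: SE(exp(ls)) ~ exp(ls) * SE(ls) -- the chain rule above
se.delta  <- exp(f2$par) * sqrt(1 / f2$hessian[1, 1])
c(direct = se.direct, delta = se.delta)   # equal up to numerical error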

> but will the result
> for a linear combination of mu and some of the u's stay invariant?   If
> it isn't invariant, how do I choose the parameterization of the
> variance components?
>
Same as above.  The whole thing is independent of how one parameterizes
the x's.
> Can you give a bit more detail on how you justify mixing derivatives
> of the marginal log-likelihood (F) with derivatives of the conditional
> density (L)?  Do you know that these are on the same scale?  I'm
> willing to believe that they are - it is just that I can't see right
> off why they should be.
>
>
Think of it this way.  You can use the marginal likelihood Hessian F_xx
to construct the estimated covariance matrix for the independent
variables x.  Then via the delta method you calculate the covariance
matrix for the dependent variables uhat(x).  So far this is just pure
frequentist stuff.  Now if the x's were known exactly, inv(L_uu) would
be an estimate for the variance of the u's.  So a simple estimate of
the overall variance of the u's is to add the two together:
uhat_x' * S_xx * uhat_x is the extra variance in the u's introduced by
the uncertainty in the true value of the uhat, which comes in turn from
uncertainty in the true value of the x's.
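
In R pseudocode, the two pieces assemble like this (my sketch; S_xx,
J = uhat_x, and L_uu are assumed already computed, with the dimensions
from the example: x of length 3, u of length 10):

## S_xx: 3x3, J: 3x10 (Jacobian of uhat wrt x), L_uu: 10x10
assemble.sigma <- function(S_xx, J, L_uu) {
  rbind(cbind(S_xx,          S_xx %*% J),
        cbind(t(S_xx %*% J), t(J) %*% S_xx %*% J + solve(L_uu)))
}
## delta method for a differentiable h(x, u) with gradient g (length 13):
##   Var(h) ~ t(g) %*% assemble.sigma(S_xx, J, L_uu) %*% g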


-- 
David A. Fournier
P.O. Box 2040, Sidney, B.C. V8l 3S3
Canada
Phone/FAX 250-655-3364
http://otter-rsch.com





------------------------------

Message: 4
Date: Fri, 19 Oct 2007 12:06:37 +1000 (EST)
From: David Duffy <David.Duffy at qimr.edu.au>
Subject: Re: [R-sig-ME] [R] coef se in lme
To: Douglas Bates <bates at stat.wisc.edu>
Cc: r-sig-mixed-models at r-project.org, davef at otter-rsch.com
Message-ID: <Pine.LNX.4.64.0710191058240.4460 at orpheus.qimr.edu.au>
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed

On Thu, 18 Oct 2007, Douglas Bates wrote:

> On 10/18/07, dave fournier <otter at otter-rsch.com> wrote:
>
>> In the AD Model Builder Random Effects package we provide estimated
>> standard deviations for any function of the fixed and random effects,
>> (here I include the parameters which detemine the covarince matrices if
>> present) and the random effects. This is for general nonlinear random
>> effects models, but the calculations can be used for linear models as
>> well. We calculate these estimates as follows. Let L(x,u)
>> be the log-likelihood function for the parameters x and u given the
>> observed data,
>> where u is the vector of random effects and x is the vector of the other
>> parameters.
>
> I know it may sound pedantic but I don't know what a log-likelihood
> L(x,u) would be because you are treating parameters and the random
> effects as if they are the same type of object and they're not.  If
>
>> Let F(x) be the log-likelihood for x after the u have been
>> integrated out. This integration might be exact or more commonly via the
>> Laplace approximation or something else.
>> For any x let uhat(x) be the value of u which maximizes L(x,u),
>
> I think that is what I would call the conditional modes of the random
> effects.  These depend on the observed responses and the model
> parameters.
>
>> and let xhat be the value of x which maximizes F(x).
>
>> The estimate for the covariance matrix for the x is then
>> S_xx = inv(F_xx) and the estimated full covariance matrix Sigma for the
>> x and u is given by
>
>> S_xx                S_xx * uhat_x
>> (S_xx * uhat_x)' uhat' * S_xx * uhat_x + inv(L_uu)
>
>> where ' denotes transpose _x denotes first derivative wrt x (note that
>> uhat is a function of x so that uhat_x makes sense) and _xx _uu denote
>> the second derivatives wrt x and u. we then use Sigma and the delta
>> method to estimate the standard deviation of any (differentiable)
>> function of x and u.
>
[Snip]
>
> Can you give a bit more detail on how you justify mixing derivatives
> of the marginal log-likelihood (F) with derivatives of the conditional
> density (L)?  Do you know that these are on the same scale?  I'm
> willing to believe that they are - it is just that I can't see right
> off why they should be.
>

I find all of this a bit above my head, but I do have a paper by Matt
Wand, _Fisher information for generalised linear mixed models_,
Journal of Multivariate Analysis (2007), 98, 1412-1416.

This looks at the canonical-link GLMM where the random effects are
Gaussian for a simple one-level/random intercepts model, and
ends rather abruptly ;) with

"Remark 2. Approximate standard errors for the maximum likelihood
estimates beta-hat and sigma-squared-hat can be obtained from the
diagonal entries of I(b-hat,s-hat2)^-1. However, as pointed out in Remark 1,
implementation is often hindered by intractable multivariate integrals.
Additionally, dependence among the entries of y induced by u means that
central limit theorems of the type: I (bhat,shat2)^-1 {(bhat,shat2)-(b,s2)
2)} converges in distribution to a N (0, I) random vector, have not been
established in general and, hence, interpretation of standard errors
is cloudy. Nevertheless, there are many special cases, such as
m-dependence when the data are from a longitudinal study, for which
central limit theorems can be established."

You'll have to read the paper, which gives the derivation and formulae.


David Duffy.
-- 
| David Duffy (MBBS PhD)                                        ,-_|\
| email: davidD at qimr.edu.au  ph: INT+61+7+3362-0217 fax: -0101  /    *
| Epidemiology Unit, Queensland Institute of Medical Research  \_,-._/
| 300 Herston Rd, Brisbane, Queensland 4029, Australia  GPG 4D0B994A v



------------------------------

Message: 5
Date: Fri, 19 Oct 2007 09:23:10 +0200
From: Martin Maechler <maechler at stat.math.ethz.ch>
Subject: Re: [R-sig-ME] change in variance components depending on
    scaling of fixed effects
To: Arild Husby <arild.husby at ed.ac.uk>
Cc: R-SIG-Mixed-Models <r-sig-mixed-models at r-project.org>
Message-ID: <18200.23390.616219.292409 at stat.math.ethz.ch>
Content-Type: text/plain; charset=us-ascii

>>>>> "AH" == Arild Husby <arild.husby at ed.ac.uk>
>>>>>    on Thu, 18 Oct 2007 09:53:19 +0100 writes:

    AH> Dear all,
    AH> I am trying to understand the output from a binomial lmer object and why 
    AH> the scaling of a fixed effect changes the variance components.

[....................... ]


    AH> All help is highly appreciated.

    AH> Thanks very much,
    AH> Arild Husby

    >> sessionInfo()
    AH> R version 2.5.0 (2007-04-23)
                -----
  [.......]

    AH> attached base packages:
    AH> [1] "stats"    "graphics"  "grDevices" "utils"    "datasets"   
    AH>    "methods"  "base"  
    AH> other attached packages:
    AH> lme4      Matrix    lattice
    AH> "0.99875-2" "0.99875-2"  "0.15-11"
        ---------

Can you upgrade to R 2.5.1
    (at least! --- always do upgrade from x.y.0 to x.y.1 !!!)
or better 2.6.0, and then run
    update.packages() ?
This will upgrade your lme4 to version 0.99875-8 (important!)
and Matrix to 0.999375-3 {also possibly quite important}.

Also (and even as an alternative to upgrading):
if you provided a *reproducible* example
[e.g. you could dput(.) part of your data frame, or, better,
save() it and put the file up for downloading; or use
set.seed() and create random data which exhibits the same "feature"],
"your audience" could at least see if the current versions of
lme4 give the same results or not.
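
For instance, a reproducible example along those lines might look like
this (a sketch with made-up names and sizes; the family= argument to
lmer() matches the lme4 versions discussed in this thread):

set.seed(101)
d <- data.frame(y   = rbinom(200, 1, 0.4),
                x   = rnorm(200),
                grp = factor(rep(1:20, each = 10)))
## dput(d[1:10, ])   # or share a slice of the real data this way
library(lme4)
m <- lmer(y ~ x + (1 | grp), data = d, family = binomial)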

Regards,
Martin Maechler, ETH Zurich



    AH> -- 
    AH> Arild Husby
    AH> Institute of Evolutionary Biology
    AH> Room 413, Ashworth Labs,
    AH> King's Buildings,
    AH> University of Edinburgh
    AH> EH9 3JT, UK

    AH> E-mail: arild.husby at ed.ac.uk
    AH> web: http://homepages.ed.ac.uk/loeske/arild.html
    AH> Tel: +44 (0)131 650 5990
    AH> Mob: +44 (0)798 275 0668

    AH> _______________________________________________
    AH> R-sig-mixed-models at r-project.org mailing list
    AH> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models



------------------------------

Message: 6
Date: Fri, 19 Oct 2007 01:44:15 -0700
From: Nathaniel Smith <njs at pobox.com>
Subject: [R-sig-ME] lmer() producing wildly inaccurate values for some
    GLM models
To: R-SIG-Mixed-Models <r-sig-mixed-models at r-project.org>
Message-ID: <20071019084415.GA18600 at frances.vorpus.org>
Content-Type: text/plain; charset=us-ascii

Hello,

The data I am trying to model are positive and right-skewed (they're
measurements of fixation times recorded from an eye-tracker), so I've
been trying to use lmer's GLM fitting capability to model them.  In
particular, I've been investigating whether it makes most sense to
use something like a gamma or inverse gaussian distribution with an
identity link (since theory and examination of the data both suggest
that the mean of the response is in fact linear in my predictors of
interest).  I was hoping to use log likelihood to work out which
model made the most sense (e.g., using the approach of Vuong 1989),
but... something seemed funny about the numbers I was getting for the
likelihood, and also the random effects, so I did some testing.

lmer's log-likelihood calculations for these models appear to be
completely wrong.  In particular, the log likelihood that it
calculates is many orders of magnitude smaller in absolute value than
it could possibly be, and orders of magnitude smaller when fitting an
inverse gaussian distribution than when fitting a gamma distribution,
even if the data in fact are closer to a gamma distribution.

To demonstrate the problem, I generated some toy data where the
true random effects are null.  This isn't necessary to cause the
problem (i.e., lmer isn't just breaking because the model is
over-specified); I did things this way because it makes the models
fit by lmer and glm more directly comparable.

This code takes 10,000 samples from the true model for a Gamma GLM
with E(y) = 200 + 3*x, identity link, and shape parameter 5, then
attempts to fit it in various ways:

---------------------------->8----------------------------

set.seed(0)
samples <- 10000
true.shape <- 5
x <- rgamma(samples, 2, scale=2)
i <- 200
b <- 3
true.scale <- (i + b*x)/true.shape
y <- rgamma(samples, true.shape, scale=true.scale)
subj <- factor(c(rep("a", samples/2), rep("b", samples/2)))
gaussian.lm <- lm(y ~ x)
gaussian.lmer <- lmer(y ~ x + (x | subj))
gamma.glm <- glm(y ~ x, family=Gamma(link="identity"))
gamma.lmer <- lmer(y ~ x + (x | subj), family=Gamma(link="identity"))
ig.glm <- glm(y ~ x, family=inverse.gaussian(link="identity"))
ig.lmer <- lmer(y ~ x + (x | subj),
                family=inverse.gaussian(link="identity"))
nums.glm <- function(model) c(coef(model), logLik=logLik(model))
nums.lmer <- function(model) c(fixef(model), logLik=logLik(model))
rbind(gaussian.lm=nums.glm(gaussian.lm), gaussian.lmer=nums.lmer(gaussian.lmer),
      gamma.glm=nums.glm(gamma.glm), gamma.lmer=nums.lmer(gamma.lmer),
      ig.glm=nums.glm(ig.glm), ig.lmer=nums.lmer(ig.lmer))

----------------------------8<----------------------------

Output:

---------------------------->8----------------------------

              (Intercept)        x        logLik
gaussian.lm      203.0480 2.686375 -59761.426766
gaussian.lmer    203.0279 2.693471 -59759.572897
gamma.glm        202.8143 2.745581 -59054.739626
gamma.lmer       202.8143 2.745464  -1025.060862
ig.glm           202.7024 2.774986 -59309.711506
ig.lmer          202.7024 2.774986     -5.788107

----------------------------8<----------------------------

So, as you can see:
  -- Every fitting function does well at recovering the correct
    regression coefficients (and furthermore there is excellent
    agreement between (g)lm and lmer -- lmer correctly identifies
    the lack of random effects in all cases).
  -- The (g)lm log-likelihoods are all plausible (also I checked the
    one for gamma by hand, and it is correct), and they correctly
    identify that this is a gamma rather than inverse-gaussian
    distribution.
  -- The lmer log-likelihood for the gaussian model seems accurate.
  -- The lmer log-likelihoods for the other two models are completely
    bonkers (I *wish* I had a model that achieved a log likelihood of
    -5.8 over 10,000 samples...)

lmer's reported AIC, BIC, and deviance appear to be similarly
underestimated.
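
The by-hand check mentioned above can be reconstructed roughly like
this (my sketch, not the poster's code; gamma.shape() from MASS
estimates the Gamma shape parameter, and the result should come out
close to logLik(gamma.glm) rather than match it exactly):

library(MASS)
mu.hat    <- fitted(gamma.glm)              # fitted means, identity link
shape.hat <- gamma.shape(gamma.glm)$alpha   # MLE of the Gamma shape
sum(dgamma(y, shape = shape.hat, scale = mu.hat / shape.hat, log = TRUE))
## compare with: logLik(gamma.glm)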


Additionally, the random effects calculated by lmer for these models
seem very fishy, but I'm not as confident in my analysis here,
because I'm not entirely sure my toy data is correctly sampling from
the true distribution that lmer assumes.  Anyway, here's the code,
which is attempting to create a model like that above, but this time
with 10 subjects, each of whose intercept and slope terms are sampled
independently from normal distributions with different variances --
400 and 1, respectively:

---------------------------->8----------------------------

set.seed(0)
subjects <- 10
samples <- 1000
true.shape <- 5
i <- 200
i.var <- 400
b <- 3
b.var <- 1
x <- NULL
subj <- NULL
true.scale <- NULL
for (s in 1:subjects) {
  subj <- c(subj, rep(s, samples))
  x.subj <- rgamma(samples, 2, scale=2)
  x <- c(x, x.subj)
  i.subj <- rnorm(1, i, sqrt(i.var))
  b.subj <- rnorm(1, b, sqrt(b.var))
  # b.subj, not b: the prose says slopes vary by subject
  true.scale <- c(true.scale, (i.subj + b.subj*x.subj)/true.shape)
}
subj <- factor(subj)
y <- rgamma(subjects * samples, true.shape, scale=true.scale)
gaussian.lm <- lm(y ~ x)
gaussian.lmer <- lmer(y ~ x + (x | subj))
gamma.glm <- glm(y ~ x, family=Gamma(link="identity"))
gamma.lmer <- lmer(y ~ x + (x | subj), family=Gamma(link="identity"))
ig.glm <- glm(y ~ x, family=inverse.gaussian(link="identity"))
ig.lmer <- lmer(y ~ x + (x | subj),
                family=inverse.gaussian(link="identity"))
nums.glm <- function(model) c(coef(model), logLik=logLik(model))
nums.lmer <- function(model) c(fixef(model), logLik=logLik(model))
rbind(gaussian.lm=nums.glm(gaussian.lm), gaussian.lmer=nums.lmer(gaussian.lmer),
      gamma.glm=nums.glm(gamma.glm), gamma.lmer=nums.lmer(gamma.lmer),
      ig.glm=nums.glm(ig.glm), ig.lmer=nums.lmer(ig.lmer))
VarCorr(gaussian.lmer)$subj
VarCorr(gamma.lmer)$subj
VarCorr(ig.lmer)$subj

---------------------------->8----------------------------

Output:

---------------------------->8----------------------------

              (Intercept)        x        logLik
gaussian.lm      192.1295 2.636897 -59432.332233
gaussian.lmer    192.0867 2.648113 -59257.574632
gamma.glm        192.3558 2.579673 -58668.208775
gamma.lmer       192.1256 2.578616  -1038.636243
ig.glm           192.4572 2.553089 -58910.126512
ig.lmer          192.4572 2.553093     -6.340702
> VarCorr(gaussian.lmer)$subj
2 x 2 Matrix of class "dpoMatrix"
            (Intercept)          x
(Intercept)  252.21320 11.4298584
x              11.42986  0.5179811
> VarCorr(gamma.lmer)$subj
2 x 2 Matrix of class "dpoMatrix"
            (Intercept)          x
(Intercept)  41.109713 1.63892630
x              1.638926 0.06795769
> VarCorr(ig.lmer)$subj
2 x 2 Matrix of class "dpoMatrix"
              (Intercept)            x
(Intercept)  1.538172e-06 -8.630771e-19
x          -8.630771e-19  5.124452e-13

---------------------------->8----------------------------

The first part just duplicates what I found above, to show that it
still arises with models that are not overspecified.

The second part is the variance-covariance matrices -- in no case are
the estimates anywhere near accurate, and for the inverse gaussian
model it somehow has decided that there are effectively no random
effects at all.  I see the same behavior on my real data -- that is,
random effects for the inverse gaussian model coming out to
effectively zero.  I can't say if the estimates for the gamma model
are also incorrect, because I don't know what the true values are
:-).  But this is not giving me confidence that the values I'm getting
are anywhere near true, either :-(.

This is all with R 2.6.0 on x86-64 linux, with lme4 0.99875-8.

Help!
-- Nathaniel

-- 
So let us espouse a less contested notion of truth and falsehood, even
if it is philosophically debatable (if we listen to philosophers, we
must debate everything, and there would be no end to the discussion).
  -- Serendipities, Umberto Eco



------------------------------

_______________________________________________
R-sig-mixed-models mailing list
R-sig-mixed-models at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models


End of R-sig-mixed-models Digest, Vol 10, Issue 16
**************************************************





