[R] error using user-defined link function with mixed models (LMER)

Jessi Brown jessilbrown at gmail.com
Sat Feb 10 20:45:39 CET 2007


Greetings, everyone. I've been trying to analyze bird nest survival
data using generalized linear mixed models (because we documented
several consecutive nesting attempts by the same individuals; i.e.
repeated measures data) and have been unable to persuade the various
GLMM models to work with my user-defined link function. Actually,
glmmPQL seems to work, but as I want to evaluate a suite of competing
models, I'd like to use ML or REML estimation methods in order to end
up with meaningful log-likelihoods.

Here's the link function I use:
logexp <- function(days = 1)
{
    linkfun <- function(mu) qlogis(mu^(1/days))
    linkinv <- function(eta) plogis(eta)^days
    mu.eta <- function(eta)
      days*.Call("logit_mu_eta", eta,
       PACKAGE = "stats")*plogis(eta)^(days-1)
    valideta <- function(eta) TRUE
    link <- paste("logexp(", days, ")", sep="")
    structure(list(linkfun = linkfun, linkinv = linkinv,
       mu.eta = mu.eta, valideta = valideta, name = link),
        class = "link-glm")
}

# Modified binomial family function (that allows logexp link function)
logexposure<-function (link="logexp",ExposureDays) {
    variance <- function(mu) mu * (1 - mu)
    validmu <- function(mu) all(mu > 0) && all(mu < 1)
    dev.resids <- function(y, mu, wt) .Call("binomial_dev_resids",
        y, mu, wt, PACKAGE = "stats")
    aic <- function(y, n, mu, wt, dev) {
        m <- if (any(n > 1))
            n
        else wt
        -2 * sum(ifelse(m > 0, (wt/m), 0) * dbinom(round(m *
            y), round(m), mu, log = TRUE))
    }
    initialize <- expression({
        if (NCOL(y) == 1) {
            if (is.factor(y)) y <- y != levels(y)[1]
            n <- rep.int(1, nobs)
            if (any(y < 0 | y > 1)) stop("y values must be 0 <= y <= 1")
            mustart <- (weights * y + 0.5)/(weights + 1)
            m <- weights * y
            if (any(abs(m - round(m)) > 0.001))
                 warning("non-integer successes in a binomial glm!")
        } else if (NCOL(y) == 2) {
            if (any(abs(y - round(y)) > 0.001))
                warning("non-integer counts in a binomial glm!")
            n <- y[, 1] + y[, 2]
            y <- ifelse(n == 0, 0, y[, 1]/n)
            weights <- weights * n
            mustart <- (n * y + 0.5)/(n + 1)
        } else stop("for the binomial family,",
                " y must be a vector of 0 and 1's\n",
            "or a 2 column matrix where col 1 is",
            " no. successes and col 2 is no. failures")
    })
    structure(list(family="binomial", link=logexp(ExposureDays),
       linkfun=logexp(ExposureDays)$linkfun,
       linkinv=logexp(ExposureDays)$linkinv, variance=variance,
       dev.resids=dev.resids, aic=aic,
       mu.eta=logexp(ExposureDays)$mu.eta, initialize=initialize,
       validmu=validmu, valideta=logexp$valideta), class = "family")
}



Now, here's how it works in a GLM:

> apfa.glm.1<-glm(Success~MeanAge+I(MeanAge^2), family=logexposure(link="logexp", ExposureDays=apfa4$Days), data=apfa4)
> summary(apfa.glm.1)

Call:
glm(formula = Success ~ MeanAge + I(MeanAge^2), family =
logexposure(link = "logexp",
    ExposureDays = apfa4$Days), data = apfa4)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-3.1525   0.2802   0.3637   0.4291   0.7599

Coefficients:
               Estimate Std. Error z value Pr(>|z|)
(Intercept)   5.5594830  0.6085542   9.136   <2e-16 ***
MeanAge      -0.0908251  0.0407218  -2.230   0.0257 *
I(MeanAge^2)  0.0014926  0.0006104   2.445   0.0145 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 323.58  on 661  degrees of freedom
Residual deviance: 285.65  on 659  degrees of freedom
AIC: 291.65

Number of Fisher Scoring iterations: 6



Next, here's the results of a glmmPQL run:

> apfa.glmm.1<-glmmPQL(Success~MeanAge+I(MeanAge^2), random=~1|Territory, family=logexposure(link="logexp", ExposureDays=apfa4$Days), data=apfa4)
iteration 1
> summary(apfa.glmm.1)
Linear mixed-effects model fit by maximum likelihood
 Data: apfa4
  AIC BIC logLik
   NA  NA     NA

Random effects:
 Formula: ~1 | Territory
         (Intercept) Residual
StdDev: 0.0003431913 1.051947

Variance function:
 Structure: fixed weights
 Formula: ~invwt
Fixed effects: Success ~ MeanAge + I(MeanAge^2)
                 Value Std.Error  DF   t-value p-value
(Intercept)   5.559466 0.6416221 624  8.664705  0.0000
MeanAge      -0.090824 0.0429346 624 -2.115397  0.0348
I(MeanAge^2)  0.001493 0.0006436 624  2.319090  0.0207
 Correlation:
             (Intr) MeanAg
MeanAge      -0.927
I(MeanAge^2)  0.826 -0.968

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-11.3646020   0.1901969   0.2485473   0.2951632   0.5499915

Number of Observations: 662
Number of Groups: 36



Finally, here's what happens when I try to run an LMER model (same
error messages no matter which estimation method I choose):

> apfa.lmer.1<-lmer(Success~MeanAge+I(MeanAge^2)+(1|Territory), data=apfa4, family=logexposure(link="logexp", ExposureDays=apfa4$Days), method="Laplace")
> summary(apfa.lmer.1)
Error in if (any(sd < 0)) return("'sd' slot has negative entries") :
	missing value where TRUE/FALSE needed
> names(apfa.lmer.1)
NULL


So, does anyone have any idea as to whether the problem is in the
user-defined link function as written, or have any thoughts about how
to get around this problem? If LMER and LME can't do it, could there
be some way to trick the glmmML function into accepting the
user-defined link function?

Thank you in advance for any help or advice.

cheers, Jessi Brown



More information about the R-help mailing list