[R] error using user-defined link function with mixed models (LMER)
Douglas Bates
bates at stat.wisc.edu
Sun Feb 11 14:38:34 CET 2007
Look at the 'link' component of the two lists. In the binomial family
object the link component is a character vector of link 1. In your
logexposure family object it is a list of length 5.
On 2/10/07, Jessi Brown <jessilbrown at gmail.com> wrote:
> Ok, I've tried checking out the structure of the binomial and
> logexposure families, the big difference appears to be the valideta
> parameter (it's "NULL" in the logexposure family).
>
> First, binomial:
> > str(binomial())
> List of 11
> $ family : chr "binomial"
> $ link : chr "logit"
> $ linkfun :function (mu)
> $ linkinv :function (eta)
> $ variance :function (mu)
> $ dev.resids:function (y, mu, wt)
> $ aic :function (y, n, mu, wt, dev)
> $ mu.eta :function (eta)
> $ initialize: expression({ if (NCOL(y) == 1) { if
> (is.factor(y)) y <- y != levels(y)[1] n <-
> rep.int(1, nobs) if (any(y < 0 | y > 1)) stop("y
> values must be 0 <= y <= 1") mustart <- (weights * y +
> 0.5)/(weights + 1) m <- weights * y if (any(abs(m -
> round(m)) > 0.001)) warning("non-integer #successes in a
> binomial glm!") } else if (NCOL(y) == 2) { if
> (any(abs(y - round(y)) > 0.001)) warning("non-integer
> counts in a binomial glm!") n <- y[, 1] + y[, 2] y <-
> ifelse(n == 0, 0, y[, 1]/n) weights <- weights * n
> mustart <- (n * y + 0.5)/(n + 1) } else stop("for the binomial
> family, y must be a vector of 0 and 1's\n", "or a 2 column
> matrix where col 1 is no. successes and col 2 is no. failures") })
> $ validmu :function (mu)
> $ valideta :function (eta)
> - attr(*, "class")= chr "family"
>
>
> Now, logexposure:
> > str(logexposure(link="logexp", ExposureDays=apfa4$days))
> List of 11
> $ family : chr "binomial"
> $ link :List of 5
> ..$ linkfun :function (mu)
> .. ..- attr(*, "source")= chr "function(mu) qlogis(mu^(1/days))"
> ..$ linkinv :function (eta)
> .. ..- attr(*, "source")= chr "function(eta) plogis(eta)^days"
> ..$ mu.eta :function (eta)
> .. ..- attr(*, "source")= chr [1:3] "function(eta)" ...
> ..$ valideta:function (eta)
> .. ..- attr(*, "source")= chr "function(eta) TRUE"
> ..$ name : chr "logexp()"
> ..- attr(*, "class")= chr "link-glm"
> $ linkfun :function (mu)
> ..- attr(*, "source")= chr "function(mu) qlogis(mu^(1/days))"
> $ linkinv :function (eta)
> ..- attr(*, "source")= chr "function(eta) plogis(eta)^days"
> $ variance :function (mu)
> ..- attr(*, "source")= chr "function(mu) mu * (1 - mu)"
> $ dev.resids:function (y, mu, wt)
> ..- attr(*, "source")= chr [1:2] "function(y, mu, wt)
> .Call(\"binomial_dev_resids\"," ...
> $ aic :function (y, n, mu, wt, dev)
> ..- attr(*, "source")= chr [1:7] "function(y, n, mu, wt, dev) {" ...
> $ mu.eta :function (eta)
> ..- attr(*, "source")= chr [1:3] "function(eta)" ...
> $ initialize: expression({ if (NCOL(y) == 1) { if
> (is.factor(y)) y <- y != levels(y)[1] n <-
> rep.int(1, nobs) if (any(y < 0 | y > 1)) stop("y
> values must be 0 <= y <= 1") mustart <- (weights * y +
> 0.5)/(weights + 1) m <- weights * y if (any(abs(m -
> round(m)) > 0.001)) warning("non-integer successes in a
> binomial glm!") } else if (NCOL(y) == 2) { if
> (any(abs(y - round(y)) > 0.001)) warning("non-integer
> counts in a binomial glm!") n <- y[, 1] + y[, 2] y <-
> ifelse(n == 0, 0, y[, 1]/n) weights <- weights * n
> mustart <- (n * y + 0.5)/(n + 1) } else stop("for the binomial
> family,", " y must be a vector of 0 and 1's\n", "or a 2
> column matrix where col 1 is", " no. successes and col 2 is no.
> failures") })
> $ validmu :function (mu)
> ..- attr(*, "source")= chr "function(mu) all(mu > 0) && all(mu < 1)"
> $ valideta : NULL
> - attr(*, "class")= chr "family"
>
>
> So, could this be the root of the problem?
>
> Here again is the logexp function:
> logexp <- function(days = 1)
> {
> linkfun <- function(mu) qlogis(mu^(1/days))
> linkinv <- function(eta) plogis(eta)^days
> mu.eta <- function(eta)
> days*.Call("logit_mu_eta", eta,
> PACKAGE = "stats")*plogis(eta)^(days-1)
> valideta <- function(eta) TRUE
> link <- paste("logexp(", days, ")", sep="")
> structure(list(linkfun = linkfun, linkinv = linkinv,
> mu.eta = mu.eta, valideta = valideta, name = link),
> class = "link-glm")
> }
>
> So, does something seem obviously wrong about the "valideta <-
> function(eta) TRUE" bit? I readily admit that I did not write these
> user-defined link and family functions (I believe they resulted from
> the combined efforts of Mark Herzog and Brian Ripley), so my clumsy
> efforts to toy with the valideta portion of the logexp link function
> aren't working.
>
> Thanks again in advance for any further advice.
>
> cheers, Jessi Brown
>
> On 2/10/07, Douglas Bates <bates at stat.wisc.edu> wrote:
> > On 2/10/07, Jessi Brown <jessilbrown at gmail.com> wrote:
> > > Greetings, everyone. I've been trying to analyze bird nest survival
> > > data using generalized linear mixed models (because we documented
> > > several consecutive nesting attempts by the same individuals; i.e.
> > > repeated measures data) and have been unable to persuade the various
> > > GLMM models to work with my user-defined link function. Actually,
> > > glmmPQL seems to work, but as I want to evaluate a suite of competing
> > > models, I'd like to use ML or REML estimation methods in order to end
> > > up with meaningful log-likelihoods.
> > >
> > > Here's the link function I use:
> > > logexp <- function(days = 1)
> > > {
> > > linkfun <- function(mu) qlogis(mu^(1/days))
> > > linkinv <- function(eta) plogis(eta)^days
> > > mu.eta <- function(eta)
> > > days*.Call("logit_mu_eta", eta,
> > > PACKAGE = "stats")*plogis(eta)^(days-1)
> > > valideta <- function(eta) TRUE
> > > link <- paste("logexp(", days, ")", sep="")
> > > structure(list(linkfun = linkfun, linkinv = linkinv,
> > > mu.eta = mu.eta, valideta = valideta, name = link),
> > > class = "link-glm")
> > > }
> > >
> > > # Modified binomial family function (that allows logexp link function)
> > > logexposure<-function (link="logexp",ExposureDays) {
> > > variance <- function(mu) mu * (1 - mu)
> > > validmu <- function(mu) all(mu > 0) && all(mu < 1)
> > > dev.resids <- function(y, mu, wt) .Call("binomial_dev_resids",
> > > y, mu, wt, PACKAGE = "stats")
> > > aic <- function(y, n, mu, wt, dev) {
> > > m <- if (any(n > 1))
> > > n
> > > else wt
> > > -2 * sum(ifelse(m > 0, (wt/m), 0) * dbinom(round(m *
> > > y), round(m), mu, log = TRUE))
> > > }
> > > initialize <- expression({
> > > if (NCOL(y) == 1) {
> > > if (is.factor(y)) y <- y != levels(y)[1]
> > > n <- rep.int(1, nobs)
> > > if (any(y < 0 | y > 1)) stop("y values must be 0 <= y <= 1")
> > > mustart <- (weights * y + 0.5)/(weights + 1)
> > > m <- weights * y
> > > if (any(abs(m - round(m)) > 0.001))
> > > warning("non-integer successes in a binomial glm!")
> > > } else if (NCOL(y) == 2) {
> > > if (any(abs(y - round(y)) > 0.001))
> > > warning("non-integer counts in a binomial glm!")
> > > n <- y[, 1] + y[, 2]
> > > y <- ifelse(n == 0, 0, y[, 1]/n)
> > > weights <- weights * n
> > > mustart <- (n * y + 0.5)/(n + 1)
> > > } else stop("for the binomial family,",
> > > " y must be a vector of 0 and 1's\n",
> > > "or a 2 column matrix where col 1 is",
> > > " no. successes and col 2 is no. failures")
> > > })
> > > structure(list(family="binomial", link=logexp(ExposureDays),
> > > linkfun=logexp(ExposureDays)$linkfun,
> > > linkinv=logexp(ExposureDays)$linkinv, variance=variance,
> > > dev.resids=dev.resids, aic=aic,
> > > mu.eta=logexp(ExposureDays)$mu.eta, initialize=initialize,
> > > validmu=validmu, valideta=logexp$valideta), class = "family")
> > > }
> > >
> > >
> > >
> > > Now, here's how it works in a GLM:
> > >
> > > > apfa.glm.1<-glm(Success~MeanAge+I(MeanAge^2), family=logexposure(link="logexp", ExposureDays=apfa4$Days), data=apfa4)
> > > > summary(apfa.glm.1)
> > >
> > > Call:
> > > glm(formula = Success ~ MeanAge + I(MeanAge^2), family =
> > > logexposure(link = "logexp",
> > > ExposureDays = apfa4$Days), data = apfa4)
> > >
> > > Deviance Residuals:
> > > Min 1Q Median 3Q Max
> > > -3.1525 0.2802 0.3637 0.4291 0.7599
> > >
> > > Coefficients:
> > > Estimate Std. Error z value Pr(>|z|)
> > > (Intercept) 5.5594830 0.6085542 9.136 <2e-16 ***
> > > MeanAge -0.0908251 0.0407218 -2.230 0.0257 *
> > > I(MeanAge^2) 0.0014926 0.0006104 2.445 0.0145 *
> > > ---
> > > Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> > >
> > > (Dispersion parameter for binomial family taken to be 1)
> > >
> > > Null deviance: 323.58 on 661 degrees of freedom
> > > Residual deviance: 285.65 on 659 degrees of freedom
> > > AIC: 291.65
> > >
> > > Number of Fisher Scoring iterations: 6
> > >
> > >
> > >
> > > Next, here's the results of a glmmPQL run:
> > >
> > > > apfa.glmm.1<-glmmPQL(Success~MeanAge+I(MeanAge^2), random=~1|Territory, family=logexposure(link="logexp", ExposureDays=apfa4$Days), data=apfa4)
> > > iteration 1
> > > > summary(apfa.glmm.1)
> > > Linear mixed-effects model fit by maximum likelihood
> > > Data: apfa4
> > > AIC BIC logLik
> > > NA NA NA
> > >
> > > Random effects:
> > > Formula: ~1 | Territory
> > > (Intercept) Residual
> > > StdDev: 0.0003431913 1.051947
> > >
> > > Variance function:
> > > Structure: fixed weights
> > > Formula: ~invwt
> > > Fixed effects: Success ~ MeanAge + I(MeanAge^2)
> > > Value Std.Error DF t-value p-value
> > > (Intercept) 5.559466 0.6416221 624 8.664705 0.0000
> > > MeanAge -0.090824 0.0429346 624 -2.115397 0.0348
> > > I(MeanAge^2) 0.001493 0.0006436 624 2.319090 0.0207
> > > Correlation:
> > > (Intr) MeanAg
> > > MeanAge -0.927
> > > I(MeanAge^2) 0.826 -0.968
> > >
> > > Standardized Within-Group Residuals:
> > > Min Q1 Med Q3 Max
> > > -11.3646020 0.1901969 0.2485473 0.2951632 0.5499915
> > >
> > > Number of Observations: 662
> > > Number of Groups: 36
> > >
> > >
> > >
> > > Finally, here's what happens when I try to run an LMER model (same
> > > error messages no matter which estimation method I choose):
> > >
> > > > apfa.lmer.1<-lmer(Success~MeanAge+I(MeanAge^2)+(1|Territory), data=apfa4, family=logexposure(link="logexp", ExposureDays=apfa4$Days), method="Laplace")
> > > > summary(apfa.lmer.1)
> > > Error in if (any(sd < 0)) return("'sd' slot has negative entries") :
> > > missing value where TRUE/FALSE needed
> > > > names(apfa.lmer.1)
> > > NULL
> >
> > > So, does anyone have any idea as to whether the problem is in the
> > > user-defined link function as written, or have any thoughts about how
> > > to get around this problem? If LMER and LME can't do it, could there
> > > be some way to trick the glmmML function into accepting the
> > > user-defined link function?
> >
> > lmer is designed to work with arbitrary families but I haven't done a
> > lot of testing outside the binomial and poisson families.
> >
> > Try looking at the structure of an instance of your family and
> > comparing it to, say,
> >
> > str(binomial())
> >
> > Make sure that all the components are there and have the correct form.
> >
> > The next step is to use verbose output so you can see the progress of
> > the iterations. Add the optional argument control = list(msVerbose =
> > 1) to your call to lmer.
> >
> > Instead of immediately requesting a summary, use str(apfa.lmer.1) to
> > check the structure. Again you may want to compare this description
> > to that from str applied to a similar fit using the binomial family.
> >
>
More information about the R-help
mailing list