[R-sig-ME] fixed effects estimates too small in binomial GLMM with low "success"-rate

Ben Bolker bbolker at gmail.com
Mon May 19 15:37:42 CEST 2014


  Just to give one more comment (in lieu of actually doing the work): if
my conjecture is correct (that it's a Jensen's inequality issue), then
the comparison of means on the linear-predictor scale would be correct.
The problem is that for many GLMM problems you can't sensibly look at
the raw data on the linear-predictor scale -- e.g., you can't
logit-transform binary data and get anything sensible.  You could try to
work out the expected effect of nonlinear averaging / Jensen's inequality
analytically or numerically and see whether it matches the observed
discrepancy ...
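
For example, a quick numerical version of that check could look like the
sketch below (the mu and sigma values are placeholders picked for
illustration, not estimates from the model discussed downthread):

## sketch: numerical Jensen's inequality check ##
## On the link scale the model says logit(p) = mu + b with b ~ N(0, sigma^2).
## For p well below 0.5 the inverse link is convex, so the marginal
## (population-averaged) probability E[plogis(mu + b)] exceeds the
## conditional probability plogis(mu) -- which is the direction of the
## discrepancy reported below (observed proportions > fitted estimates).
mu    <- qlogis(0.02)  # placeholder: roughly the error rates in question
sigma <- 0.7           # placeholder random-effects SD
b     <- rnorm(1e6, sd = sigma)
mean(plogis(mu + b))   # marginal mean, noticeably larger than ...
plogis(mu)             # ... the conditional mean at b = 0
## end sketch ##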



On 14-05-19 06:22 AM, Solis wrote:
> Dear Ben and others on the list,
> 
> The 'possibly neglected' mail exchange is probably this one: 
> https://stat.ethz.ch/pipermail/r-sig-mixed-models/2014q1/021919.html
> 
> I hope the link helps (I'm about to board a plane), but I'm excited
> to see that my question has been picked up and might have an answer.
> 
> Kind regards,
> 
> Tom
> 
>> On 18 May 2014 at 12:00,
>> <r-sig-mixed-models-request at r-project.org> wrote:
>> 
>> Send R-sig-mixed-models mailing list submissions to 
>> r-sig-mixed-models at r-project.org
>> 
>> To subscribe or unsubscribe via the World Wide Web, visit 
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models or, via
>> email, send a message with subject or body 'help' to 
>> r-sig-mixed-models-request at r-project.org
>> 
>> You can reach the person managing the list at 
>> r-sig-mixed-models-owner at r-project.org
>> 
>> When replying, please edit your Subject line so it is more
>> specific than "Re: Contents of R-sig-mixed-models digest..."
>> 
>> 
>> Today's Topics:
>> 
>>    1. fixed effects estimates too small in binomial GLMM with low
>>       "success"-rate (Henrik Singmann)
>>    2. Re: fixed effects estimates too small in binomial GLMM with low
>>       "success"-rate (Ben Bolker)
>>    3. Re: fixed effects estimates too small in binomial GLMM with low
>>       "success"-rate (Henrik Singmann)
>>    4. "parm" parameter in lme4::confint.merMod (Jake Westfall)
>> 
>> 
>> ----------------------------------------------------------------------
>>
>>
>> 
>> Message: 1
>> Date: Sun, 18 May 2014 00:07:13 +0200
>> From: Henrik Singmann <henrik.singmann at psychologie.uni-freiburg.de>
>> To: r-sig-mixed-models at r-project.org
>> Subject: [R-sig-ME] fixed effects estimates too small in binomial GLMM
>>     with low "success"-rate
>> Message-ID: <5377DD91.9030705 at psychologie.uni-freiburg.de>
>> Content-Type: text/plain; charset=ISO-8859-15; format=flowed
>> 
>> Dear list,
>> 
>> I would really appreciate your input on an issue in which the fixed
>> effects estimates of a binomial GLMM with a relatively low rate of
>> "successes" are too low (compared to the observed effect) by around
>> 0.5 percentage points.
>> 
>> The data come from six comparable experiments in which we measured
>> participants' performance errors on a series of trials in a design
>> with one factor with two levels ("a" and "b"). Overall, the
>> difference in error rate between the two levels was very small
>> (~0.5 percentage points), but when analyzing all data (with
>> participant and experiment as random effects) the effect of the
>> factor reached significance (p = .03), which is also consistent with
>> other published data.
>> 
>> However, the estimated effects are around 0.5 percentage points
>> lower than the observed mean error rates, indicating that the model
>> may not adequately describe the data. Furthermore, I also get a
>> convergence warning ("Model failed to converge with max|grad| =
>> 0.0013716 (tol = 0.001)").
>> 
>> ####### code 1 ##########
>> require(lme4)
>> dat <- read.table("http://pastebin.com/raw.php?i=KWjD5EG3")
>> dat$experiment <- factor(dat$experiment)
>> 
>> # observed effects
>> aggregate(errors ~ factor, dat, mean)  # unweighted
>> ##   factor     errors
>> ## 1      a 0.02266914
>> ## 2      b 0.02772540
>> by(dat, dat$factor, function(x)
>>    with(x, sum(errors*weights)/sum(weights)))  # weighted
>> ## dat$factor: a
>> ## [1] 0.0222343
>> ## ------------------------------------------------------------
>> ## dat$factor: b
>> ## [1] 0.02777651
>> 
>> m1 <- glmer(errors ~ factor + (factor|id) + (1|experiment), dat,
>>             weights = dat$weights, family = binomial)
>> ## Warning message:
>> ## In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
>> ##   Model failed to converge with max|grad| = 0.0013716 (tol = 0.001)
>> coef(summary(m1))[2, , drop = FALSE]
>> ##          Estimate Std. Error  z value   Pr(>|z|)
>> ## factorb 0.1788881 0.08005533 2.234556 0.02544653
>> 
>> # estimated effect:
>> logistic <- function(x) 1/(1+exp(-x))
>> logistic(cumsum(fixef(m1)))
>> ## (Intercept)     factorb
>> ##  0.01825229  0.02174991
>> ####### end code 1 ##########
>> 
>> Interestingly, I can replicate this behavior when simulating
>> binomial data similar to mine. When the overall mean is relatively
>> small, the estimates from the GLMM are consistently too low. When
>> the overall mean is increased, this consistent bias in the
>> estimated effect seems to go away (and the observed effect gets
>> larger due to the non-linear nature of the logistic model). The
>> simulation code for one data set follows below.
>> 
>> My question is: does anyone have an idea what I could do to reach
>> better agreement between observed and predicted values? I tried
>> different link functions, but that didn't help. And I am unsure
>> whether I can trust my model given this discrepancy.
>> 
>> Thanks a lot, Henrik
>> 
>> 
>> ##### code 2 #####
>> 
>> mu <- -3.7  # when larger, problem disappears
>> n_experiments <- 6
>> n_participants <- 20
>> n_trials <- 400
>> sigma_effect <- 0.2
>> effect_size <- 0.4
>> sigma_p <- 0.7
>> sigma_e <- 0.1
>> prob_level <- 0.5
>> 
>> # create data:
>> df <- expand.grid(experiment = factor(seq(1, n_experiments)),
>>                   id = factor(seq(1, n_participants)),
>>                   factor = c(-1, 1))
>> df$id <- with(df, experiment:id)
>> df <- merge(df, data.frame(experiment = seq(1, n_experiments),
>>                            re_e = rnorm(n_experiments, sd = sigma_e)))
>> df <- merge(df, data.frame(id = levels(df$id),
>>                            re_p = rnorm(length(levels(df$id)), sd = sigma_p)))
>> df <- merge(df, data.frame(id = levels(df$id),
>>                            re_a = rnorm(length(levels(df$id)), sd = sigma_effect)))
>> df <- with(df, df[order(id, factor),])
>> tmp <- rbinom(n_experiments*n_participants, n_trials, prob_level)
>> tmp <- rep(tmp, each = 2)
>> tmp[c(FALSE, TRUE)] <- n_trials - tmp[c(FALSE, TRUE)]
>> df$errors <- mapply(function(x, y) rbinom(1, y, x),
>>                     with(df, logistic(mu + re_p + (factor*(effect_size/2) + factor*re_a) + re_e)),
>>                     tmp)/tmp
>> df$weights <- tmp
>> df$factor <- factor(df$factor)
>> 
>> aggregate(errors ~ factor, df, mean)  # unweighted
>> ##   factor     errors
>> ## 1     -1 0.02352719
>> ## 2      1 0.03718523
>> 
>> s1 <- glmer(errors ~ factor + (factor|id) + (1|experiment), df,
>>             weights = df$weights, family = binomial)
>> ## Warning message:
>> ## In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
>> ##   Model failed to converge with max|grad| = 0.0130648 (tol = 0.001)
>> coef(summary(s1))[2, , drop = FALSE]
>> ##          Estimate Std. Error  z value     Pr(>|z|)
>> ## factor1 0.4765694 0.07578773 6.288213 3.211416e-10
>> 
>> logistic(cumsum(fixef(s1)))
>> ## (Intercept)     factor1
>> ##  0.01849934  0.02946117
>> 
>> #### end code 2 ####
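>> 
>> (A sanity-check sketch, reusing the simulation parameters above;
>> exact numbers vary with the RNG seed:)
>> 
>> # marginal means implied by the simulation, averaging over the
>> # random-effects distribution -- these roughly match the observed
>> # proportions, while the conditional means below match the (smaller)
>> # model estimates, consistent with Jensen's inequality
>> sd_tot <- sqrt(sigma_p^2 + sigma_effect^2 + sigma_e^2)
>> mean(logistic(rnorm(1e6, mu - effect_size/2, sd_tot)))  # marginal, level -1
>> mean(logistic(rnorm(1e6, mu + effect_size/2, sd_tot)))  # marginal, level  1
>> logistic(mu + c(-1, 1)*effect_size/2)                   # conditional means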
>> 
>> 
>> -- 
>> Dr. Henrik Singmann
>> Albert-Ludwigs-Universität Freiburg, Germany
>> http://www.psychologie.uni-freiburg.de/Members/singmann
>> 
>> 
>> 
>> ------------------------------
>> 
>> Message: 2
>> Date: Sat, 17 May 2014 18:19:05 -0400
>> From: Ben Bolker <bbolker at gmail.com>
>> To: Henrik Singmann <henrik.singmann at psychologie.uni-freiburg.de>,
>>     r-sig-mixed-models at r-project.org
>> Subject: Re: [R-sig-ME] fixed effects estimates too small in binomial
>>     GLMM with low "success"-rate
>> Message-ID: <5377E059.2050800 at gmail.com>
>> Content-Type: text/plain; charset=ISO-8859-15
>> 
>> On 14-05-17 06:07 PM, Henrik Singmann wrote:
>>> Dear list,
>>> 
>>> I would really appreciate your input on an issue in which the
>>> fixed effects estimates of a binomial GLMM with a relatively low
>>> rate of "successes" are too low (compared to the observed effect)
>>> by around 0.5 percentage points.
>> 
>> Don't know, but one comment and one thought:
>> 
>> (1) the convergence warning goes away with the most recent version 
>> of lme4 (1.1-7)
>> 
>> (2) I'm _guessing_ that you will find that this phenomenon is due
>> to Jensen's effect (i.e., a nonlinear averaging effect) -- that
>> would mean its magnitude should be proportional to the 
>> random-effects variances and to the strength of the nonlinear 
>> (inverse link) relationship. (I think there was a similar, possibly
>> neglected, question along these lines on the mailing list earlier
>> ...)
>> 
>> Ben Bolker
>> 
>> 
>> 
>> ------------------------------
>> 
>> Message: 3
>> Date: Sun, 18 May 2014 00:38:20 +0200
>> From: Henrik Singmann <henrik.singmann at psychologie.uni-freiburg.de>
>> To: r-sig-mixed-models at r-project.org
>> Subject: Re: [R-sig-ME] fixed effects estimates too small in binomial
>>     GLMM with low "success"-rate
>> Message-ID: <5377E4DC.6080809 at psychologie.uni-freiburg.de>
>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>> 
>> 
>> 
>> On 18.05.2014 00:19, Ben Bolker wrote:
>>> On 14-05-17 06:07 PM, Henrik Singmann wrote:
>>>> Dear list,
>>>> 
>>>> I would really appreciate your input on an issue in which the
>>>> fixed effects estimates of a binomial GLMM with a relatively low
>>>> rate of "successes" are too low (compared to the observed
>>>> effect) by around 0.5 percentage points.
>>> 
>>> Don't know, but one comment and one thought:
>>> 
>>> (1) the convergence warning goes away with the most recent
>>> version of lme4 (1.1-7)
>> 
>> This is true, thanks.
>>> 
>>> (2) I'm _guessing_ that you will find that this phenomenon is due
>>> to Jensen's effect (i.e., a nonlinear averaging effect) -- that
>>> would mean its magnitude should be proportional to the 
>>> random-effects variances and to the strength of the nonlinear 
>>> (inverse link) relationship. (I think there was a similar,
>>> possibly neglected, question along these lines on the mailing
>>> list earlier ...)
>> 
>> Thanks for the pointer. In fact, when entering experiment as a fixed
>> effect (which removes the random effect with the smallest variance),
>> the estimates agree much more closely with the observed values; they
>> are almost perfect. Perhaps I should then use this approach
>> (although I would prefer treating experiment as random). Or what
>> would you do?
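>> 
>> (Concretely, by "experiment as fixed effect" I mean something like
>> this sketch, with dat from code 1 above; m2 is just an illustrative
>> name and I omit the output:)
>> 
>> # experiment as a fixed effect instead of (1|experiment):
>> m2 <- glmer(errors ~ factor + experiment + (factor|id), dat,
>>             weights = dat$weights, family = binomial)
>> logistic(cumsum(fixef(m2)[1:2]))  # fitted rates, reference experiment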
>> 
>> 
>>> 
>>> Ben Bolker
>> 
>> -- 
>> Dr. Henrik Singmann
>> Albert-Ludwigs-Universität Freiburg, Germany
>> http://www.psychologie.uni-freiburg.de/Members/singmann
>> 
>> 
>> 
>> ------------------------------
>> 
>> Message: 4
>> Date: Sun, 18 May 2014 00:03:52 -0600
>> From: Jake Westfall <jake987722 at hotmail.com>
>> To: "r-sig-mixed-models at r-project.org"
>>     <r-sig-mixed-models at r-project.org>
>> Subject: [R-sig-ME] "parm" parameter in lme4::confint.merMod
>> Message-ID: <BAY172-W2375062D4CFCE0B6E5F9CBCB330 at phx.gbl>
>> Content-Type: text/plain
>> 
>> Hi all,
>> 
>> According to the documentation, in confint.merMod() you can use the
>> "parm" parameter to get confidence intervals for only a specified
>> subset of parameters by indicating the integer positions of the
>> desired parameters. But how is one supposed to know which integer
>> positions correspond to which parameters? I poked around a little
>> but found nothing in the methods or slots of the merMod object that
>> indicates this, and nothing in the documentation detailing it.
>> Have I missed something? I guess you can call the function leaving
>> that argument at its default, and the order of the parameters in
>> that output will correspond to the integer positions, but this
>> approach would seem to defeat the point of being able to specify
>> particular subsets of parameters...
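>> 
>> (One workaround sketch, if I have the ordering right: for profile
>> intervals the positions seem to be the variance-covariance
>> parameters in getME(., "theta") order, followed by the fixed
>> effects, with .sigma in between for LMMs. Using m1 from Message 1
>> above:)
>> 
>> pnames <- c(names(getME(m1, "theta")), names(fixef(m1)))
>> pnames                              # position i here = integer parm i
>> confint(m1, parm = length(pnames))  # profile only the last fixed effect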
>> 
>> Jake
>> 
>> 
>> 
>> ------------------------------
>> 
>> _______________________________________________
>> R-sig-mixed-models mailing list
>> R-sig-mixed-models at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>> 
>> 
>> End of R-sig-mixed-models Digest, Vol 89, Issue 22 
>> **************************************************
> 
> _______________________________________________ 
> R-sig-mixed-models at r-project.org mailing list 
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>


