[R-sig-ME] Specification of binomial mixed model with custom intercept

Tom Wenseleers Tom.Wenseleers at bio.kuleuven.be
Sat Jul 4 20:15:23 CEST 2015


Hi Dale,
Many thanks for the pointer - yes, it turned out that this was indeed the right syntax, e.g. for a binomial GLM (analogous syntax for a binomial mixed model) :

data=data.frame(treatment=c("INF","CONTR","INF","CONTR","INF","CONTR","INF","CONTR"),day=c(4,4,8,8,12,12,16,16),infected=c(20,11,18,15,19,16,19,19),not_infected=c(0,9,2,5,1,4,1,1))
data$propinfected=data$infected/(data$infected+data$not_infected)
data$baseline=qlogis(c(0.01,0.99))[data$treatment] # to have starting vals near 0 or 1 for the two groups (CONTROL and experimentally INFected)
fit=glm(cbind(infected,not_infected)~ -1+treatment:log(day),offset=baseline,family=binomial,data=data)

(used log(day) here because that gave a better fit)

Basically, it's a binomial GLM to describe the evolution of the proportion of infected individuals as a function of time (day) in two groups, a CONTROL group, where the initial prop of infecteds is known to be near 0 at day=0, and an experimentally infected group INF, where the initial prop of infecteds is known to be near 1 at day=0.

Turned out the problem had more to do with getting the effects package to plot the model correctly.
Apparently the effects package only supports specifying one fixed intercept/offset - in my case I got around this by plotting the predictions for the two groups separately and each time suppressing the other group's output, and specifying the correct offset for each group : 

library(effects)

a=plot(effect(fit,term="treatment:log(day)",xlevels=list(day=seq(0.01, 17, 1)),offset=qlogis(0.01) ),ylim=c(-0.05,1.05),xlim=c(-1,17),multiline=T,xlab="Day",ylab="Proportion infected",ci.style="bands",rescale.axis=F,rug=F,lines=c(1,1),lwd=c(2,2),colors=c("black",rgb(1,1,1,0)))

b=plot(effect(fit,term="treatment:log(day)",xlevels=list(day=seq(0.01, 17, 1)),offset=qlogis(0.99) ),ylim=c(-0.05,1.05),xlim=c(-1,17),multiline=T,xlab="Day",ylab="Proportion infected",ci.style="bands",rescale.axis=F,rug=F,lines=c(1,1),lwd=c(2,2),colors=c(rgb(1,1,1,0),"red"))

library(latticeExtra)
c=xyplot(propinfected~day,data=data,pch=15,cex=2,col=data$treatment,ylim=c(0,1),xlim=c(-1,17))
a+as.layer(b)+as.layer(c)

Of course I could also use predict() directly, or work with the Effect() dataframes to plot the model, rather than apply this hack...

cheers,
Tom

________________________________________
From: R-sig-mixed-models [r-sig-mixed-models-bounces at r-project.org] on behalf of Dale Barr [dale.barr at glasgow.ac.uk]
Sent: 24 June 2015 18:31
To: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] Specification of binomial mixed model with custom intercept

[ ...and of course also including the offset(baseline) term in the
formula... ]

On 24/06/15 17:27, Dale Barr wrote:
> Hi Tom,
>
> You might want to try the original syntax again, but without estimation
> of the (known) intercept term, using the "-1" syntax:
>
> fit=glmer(cbind(infected,not_infected) ~ -1 + (1|colony) + treatment *
> time, family=binomial, data=data)
>
> -Dale
>
> On 23/06/15 22:23, Tom Wenseleers wrote:
>> Hi Jake,
>> Many thanks for your advice! And yes realised the model would never quite get to 0% or 100% - that's why I had been trying putting in an intercept, as in
>> data$baseline=qlogis(c(0.001,0.999))[data$treatment]
>>
>> But just tried adding points for t=0 and that does indeed seem to give sensible results - so I'll go with that then - thanks for the advice!
>>
>> If anyone else on this list would know how to formally put in constraints like the ones I mentioned, please let me know though!
>>
>> cheers,
>> Tom
>>
>> ________________________________
>> From: Jake Westfall [jake987722 at hotmail.com]
>> Sent: 23 June 2015 23:01
>> To: Tom Wenseleers
>> Subject: RE: [R-sig-ME] Specification of binomial mixed model with custom intercept
>>
>> I see. This does seem more sensible. One complication I should point out is that you will never get your model to predict exactly 100% or 0%, as these correspond to logits of infinity or -infinity, respectively. You could set them to something high like logit = +/- 10 (corresonding to p = .99995 or .00005), but the exact values you fix them to are arbitrary and will affect the other model estimates. So it's tricky. One sort of klugey solution could be to put the time=0 measurements in the dataset "as if" you had recorded them -- with the justification being that you are virtually certain what the measurements would have been had you technically taken them at time=0 -- and then run the unconstrained model. This would basically just be a not-completely-arbitrary way of deciding what non-infinite values to fix the time=0 predictions to.
>>
>>
>> Jake
>>
>>> From: Tom.Wenseleers at bio.kuleuven.be
>>> To: jake987722 at hotmail.com; r-sig-mixed-models at r-project.org
>>> Subject: RE: [R-sig-ME] Specification of binomial mixed model with custom intercept
>>> Date: Tue, 23 Jun 2015 19:35:20 +0000
>>>
>>> Hi Jake,
>>> Well to clarify a bit - I have actual datapoints for time=4, 8, 12 and 16, but not for t=0 days.
>>> For t=0, however, I know that based on my treatments (injecting individuals with virus lysate or with buffer) the proportion of infected individuals was ca 0% for the CONTROL treatment and 100% for the INJECTED group.
>>> Problem is that if this a priori constraint is not taken into account and I fit my model and make an effect plot, the prediction is not exactly 0% for the CONTROL group or 100% for the INJECTED group, even though I know that it should. So my question is whether constraints such as these can be taken into account into either binomial GLMs or binomial mixed models, e.g. by specifying custom offsets/intercepts? (I also have other similar models where I would like to be able to specify that at time=0 the initial proportion is known a priori to be 0.5)
>>>
>>> In general my aim of specifying constraints such as these would be to obtain better fits that better/more parsimoniously reflect known facts about the actual experiments.
>>>
>>> cheers,
>>> Tom
>>>
>>> ________________________________________
>>> From: R-sig-mixed-models [r-sig-mixed-models-bounces at r-project.org] on behalf of Jake Westfall [jake987722 at hotmail.com]
>>> Sent: 23 June 2015 17:30
>>> To: r-sig-mixed-models at r-project.org
>>> Subject: Re: [R-sig-ME] Specification of binomial mixed model with custom intercept
>>>
>>> Hi Tom,
>>>
>>> I'm not sure if this is a sensible thing to do. If your presumption about the proportion of infected insects in each group at time=0 is correct, then surely your data must already reflect this fact? In which case I don't see why you can't just estimate the unconstrained model that you wrote and let the model figure out for itself what p(infected) is at time=0. In short, I don't see the added value of the constraints you mention.
>>>
>>> With that said, it occurs to me that if you really do want to implement the two constraints that you mentioned, then really you are not estimating any fixed-effect parameters at time=0. So it seems you could just as well exclude the time=0 data and just model the treatment factor at time=1. >From those parameter estimates it would be easier to figure out what the time slopes are for each group, since they will just be the difference between the time=1 parameter estimates and whatever values you fixed the proportions at time=0 to. Hope this makes sense.
>>>
>>> Jake
>>>
>>>> From: Tom.Wenseleers at bio.kuleuven.be
>>>> To: r-sig-mixed-models at r-project.org
>>>> Date: Tue, 23 Jun 2015 15:02:47 +0000
>>>> Subject: [R-sig-ME] Specification of binomial mixed model with custom intercept
>>>>
>>>> Dear all,
>>>> I have a binomial mixed model
>>>> fit=glmer(cbind(infected,not_infected)~(1|colony)+treatment*time,family=binomial,data=data)
>>>> in which I am modelling the evolution of an infection in different social insect colonies across two treatment groups (INJECTED and CONTROL) as a function of time.
>>>> However, as my INJECTED group individuals should all be infected at time=0, whereas none of my CONTROL individuals should be infected at time=0, I would like to force the model to go approx through 1 at time t=0 for the INJECTED group and to go approx through 0 at time t=1 for the CONTROL group. What would be the correct way to specify such a model?
>>>> I tried with
>>>> data$baseline=qlogis(c(0.001,0.999))[data$treatment]
>>>> fit=glmer(cbind(infected,not_infected)~(1|colony)+treatment*time+offset(baseline),family=binomial,data=data)
>>>> but this doesn't seem to give sensible predictions.
>>>> Any thoughts on the correct syntax?
>>>>
>>>> cheers,
>>>> Tom Wenseleers
>>>>
>>>>
>>>> _______________________________________________
>>>> R-sig-mixed-models at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>> [[alternative HTML version deleted]]
>>>
>>> _______________________________________________
>>> R-sig-mixed-models at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>      [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

--
Dale Barr
Institute of Neuroscience and Psychology
University of Glasgow
58 Hillhead Street
Glasgow G12 8QB

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models



More information about the R-sig-mixed-models mailing list