[R-sig-ME] Correct specification of nested binomial mixed model with custom intercept to infer variance components and intraclass correlations

Tom Wenseleers tom.wenseleers at kuleuven.be
Fri Jan 27 10:12:33 CET 2017

Hi Jarrod,
Well, the data file is structured so that each row contains either the proportion of the eggs eaten in a given trial by an individual bee of a particular patriline (fit1 below) or the mean proportion of the eggs eaten in a given trial by a particular patriline (fit2 below). In the first case there are several rows per trial, one for each individual bee, and some bees may occur again in different trials with the same colony. So I was not entirely sure how I should incorporate trial as a random effect as well, and what this would look like?
The reason I thought it would make sense to also include the proportional distribution of the different patrilines in the colony in the model (either as a custom intercept or as a covariate, I am not sure which) is that these proportions, which are heavily skewed, give the a priori probability that eggs would be eaten by bees belonging to the different patrilines. The heritability of the trait would then be expressed mainly in terms of some patrilines making a much greater or much smaller contribution to egg eating than expected from their proportional presence in the colony. But I am not sure how to do that correctly in my model, hence my question. Does that make sense?
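
For reference, one common way to turn variance components from a logit-link mixed model into an intraclass correlation is to work on the latent scale, where the residual variance is taken as pi^2/3. The sketch below uses made-up variance values (they are not estimates from this data set); with a fitted lme4 model, the values could instead be read from as.data.frame(VarCorr(fit1)). Whether this latent-scale quantity is the right basis for a heritability estimate here is an assumption, not something settled in this thread.

```r
# Hypothetical variance components on the logit scale (illustrative values only,
# not estimates from the egg-eating data)
vc <- c(colony = 0.30, patriline = 0.20, ID = 0.50)

# Distribution-specific residual variance implied by the logit link
resid_logit <- pi^2 / 3

# Total latent-scale variance
total_var <- sum(vc) + resid_logit

# Share of latent-scale variance attributable to patriline within colony
prop_patriline <- unname(vc["patriline"] / total_var)

# Latent-scale correlation between two bees sharing colony and patriline
icc_same_patriline <- unname((vc["colony"] + vc["patriline"]) / total_var)

print(c(prop_patriline = prop_patriline,
        icc_same_patriline = icc_same_patriline))
```

If trial were added as a further random effect, its variance would enter total_var as well.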


From: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org> on behalf of Jarrod Hadfield <j.hadfield at ed.ac.uk>
Sent: 27 January 2017 06:37
To: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] Correct specification of nested binomial mixed model with custom intercept to infer variance components and intraclass correlations

Hi Tom,

If I understand your experimental/data setup correctly, each row of the
data frame contains all the data for one trial? If so, having trial as a
random effect is one way of modelling any overdispersion in the data with
respect to the binomial. If overdispersion exists, it is important to
model it. Other than that, the random effect structure seems fine.
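
As a rough illustration of the overdispersion point, the sketch below simulates binomial counts with extra trial-to-trial variation, checks the Pearson dispersion statistic of a plain binomial GLM, and then (only if lme4 is installed) adds a trial random intercept. The data, the column names trial, total and eaten, and all parameter values are invented for the example; in the real model the equivalent step would presumably be adding a term such as (1|trial) to the existing random effects.

```r
# Simulated data; none of these numbers come from the thread.
set.seed(1)
n_trials <- 40
bees_per_trial <- 5
d <- data.frame(
  trial = factor(rep(seq_len(n_trials), each = bees_per_trial)),
  total = 20
)

# Trial-level noise on the logit scale creates extra-binomial variation
trial_effect <- rnorm(n_trials, mean = 0, sd = 1)
p <- plogis(-1 + trial_effect[as.integer(d$trial)])
d$eaten <- rbinom(nrow(d), size = d$total, prob = p)

# A plain binomial GLM ignores the trial-to-trial variation
fit_glm <- glm(cbind(eaten, total - eaten) ~ 1, family = binomial, data = d)

# Pearson dispersion statistic: values well above 1 suggest overdispersion
dispersion <- sum(residuals(fit_glm, type = "pearson")^2) / df.residual(fit_glm)
print(dispersion)

# With lme4 available, a trial random intercept absorbs the extra variation
if (requireNamespace("lme4", quietly = TRUE)) {
  fit_trial <- lme4::glmer(cbind(eaten, total - eaten) ~ 1 + (1 | trial),
                           family = binomial, data = d)
  print(lme4::VarCorr(fit_trial))
}
```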

Also I don't understand why baseline is fitted, especially as an offset.
If you are fitting patriline as a random effect presumably you know the
patriline for each bee? Why then fit the proportion of the colony that
has the same patriline as the bee, and why fix the associated
coefficient to one?
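
To make the offset question concrete: an offset enters the linear predictor with its coefficient fixed at one, whereas a covariate has its slope estimated from the data. The sketch below shows the difference with base R glm on simulated data; the variable names and the true slope of 0.5 are assumptions made for the example, and the same contrast should carry over to glmer with the random effects added.

```r
# Simulated data; values are illustrative and not taken from the thread.
set.seed(2)
n <- 200
expected_prop <- runif(n, 0.05, 0.6)   # hypothetical patriline proportions
baseline <- qlogis(expected_prop)      # logit of the expected proportion
total <- rep(20, n)

# Simulate with a true slope of 0.5 on the logit scale (i.e. not 1)
eaten <- rbinom(n, size = total, prob = plogis(0.2 + 0.5 * baseline))
d <- data.frame(eaten, total, baseline)

# Offset: the coefficient of baseline is fixed at 1, only the intercept is estimated
fit_offset <- glm(cbind(eaten, total - eaten) ~ 1 + offset(baseline),
                  family = binomial, data = d)

# Covariate: the slope of baseline is estimated from the data
fit_cov <- glm(cbind(eaten, total - eaten) ~ 1 + baseline,
               family = binomial, data = d)

print(coef(fit_offset))   # intercept only
print(coef(fit_cov))      # intercept and estimated slope

# Wald interval for the slope; if it excludes 1, the offset assumption is doubtful
slope_ci <- confint.default(fit_cov)["baseline", ]
print(slope_ci)
```

Fitting baseline as a covariate first and checking whether its slope is compatible with one is a simple way to see whether the offset is a reasonable constraint.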



On 26/01/2017 15:11, Tom Wenseleers wrote:
> Dear all,
> Just to ask a bit of advice about the correct way to specify a nested binomial GLM, in the context of estimating variance components / intraclass correlations to infer heritabilities of a behavioural trait.
> The behavioural trait is a binary one (eating an egg or not, 1 or 0), and is performed by known individual honeybees (individually numbered, “individual_ID”) of a known father line (“patriline”) of a given hive (“colony”). The same individual could perform several successive egg eating events. For each experiment with each colony several trials were run, and for each trial we have data on how many eggs were eaten in total, so we could analyse as the dependent variable the proportion of those eggs that were eaten by a given individual. In addition, we also genotyped a sample of bees from each colony, which gave us the patriline distribution within each colony (“expected_proportion_patriline”), i.e. the proportion that each patriline makes up in the colony, which I thought should affect the a priori probability that bees of a given patriline would be observed eating eggs.
> My question is what mixed model syntax would make most sense to analyse this data, and allows us to infer variance components and intraclass correlations as a basis for a heritability estimate of this egg eating behaviour?
> One model I thought of was to include the expected proportion of each patriline that is present as a custom offset, using
> library(afex)
> set_sum_contrasts() # use effect coding
> data$baseline=qlogis(data$expected_proportion_patriline) # custom intercept (qlogis=logit)
> fit1=glmer(cbind(eggs_eaten_by_individual, eggs_eaten_in_total_intrial - eggs_eaten_by_individual)~-1+(1|colony/patriline/ID), offset=baseline, data=data, family=binomial)
> but would this model make sense as a basis to estimate the variance components and intraclass correlation?
> In another model I worked with the mean eggs eaten by bees of a given patriline and then fitted the model
> data$baseline=qlogis(data$expected_proportion_patriline) # custom intercept (qlogis=logit)
> fit2=glmer(cbind(eggs_eaten_by_patriline, eggs_eaten_in_total_intrial - eggs_eaten_by_patriline)~-1+(1|colony/patriline), offset=baseline, data=data, family=binomial)
> Again though I am not sure if such a model would make sense, and neither of my two models take trial explicitly as a factor. Would anybody have any advice by any chance about the most sensible model given my experimental design (proportion data, with individuals nested in patriline nested in colony, with repeated trials and a priori info on the proportion of eggs that would be expected to be eaten by each patriline based on independent genotyping, which could perhaps be included as a custom intercept or a covariate)?
> Best regards,
> Tom Wenseleers
> Dept of Biology
> University of Leuven
> Belgium
> _____
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

