[R-sig-ME] Correct specification of nested binomial mixed model with custom intercept to infer variance components and intraclass correlations

Fri Jan 27 06:37:55 CET 2017

Hi Tom,

If I understand your experimental setup correctly, each row of the data 
frame contains all the data for one trial? If so, fitting trial as a 
random effect is one way of modelling any overdispersion in the data 
with respect to the binomial. If overdispersion exists, it is important 
to model it. Other than that, the random effect structure seems fine.
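
To illustrate the overdispersion point, here is a minimal sketch on 
simulated data (all names and parameter values are made up, not taken 
from the real data set). Trial-level variation on the logit scale pushes 
the variance of the counts above what a plain binomial allows, and a 
trial random effect is one way to absorb it. The glmer() call assumes 
lme4 is installed and is skipped otherwise.

```r
set.seed(1)
n_trials <- 500   # hypothetical number of trials (one row per trial)
n_eggs   <- 20    # hypothetical number of eggs per trial
p0       <- 0.3   # hypothetical mean probability of an egg being eaten

# Trial-level random effect on the logit scale (sd = 1 is an assumption)
u <- rnorm(n_trials, mean = 0, sd = 1)
eaten <- rbinom(n_trials, size = n_eggs, prob = plogis(qlogis(p0) + u))

# Compare the observed variance of the counts with the binomial expectation
p_hat <- mean(eaten) / n_eggs
var_binomial <- n_eggs * p_hat * (1 - p_hat)
dispersion_ratio <- var(eaten) / var_binomial  # values well above 1 indicate overdispersion

sim <- data.frame(trial     = factor(seq_len(n_trials)),
                  eaten     = eaten,
                  not_eaten = n_eggs - eaten)

# With one row per trial, (1 | trial) acts as an observation-level random effect
if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::glmer(cbind(eaten, not_eaten) ~ 1 + (1 | trial),
                     data = sim, family = binomial)
  print(lme4::VarCorr(fit))
}
```

In the real model the same idea would mean adding a trial term to the 
existing random effects, assuming a trial identifier is available in the 
data frame.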

Also I don't understand why baseline is fitted, especially as an offset. 
If you are fitting patriline as a random effect presumably you know the 
patriline for each bee? Why then fit the proportion of the colony that 
has the same patriline as the bee, and why fix the associated 
coefficient to one?
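
For reference, the difference between an offset and a covariate can be 
seen in a small simulated example (plain glm() for brevity; the names, 
sample sizes and the true slope of 0.5 are all assumptions made for 
illustration). An offset enters the linear predictor with its 
coefficient fixed at one, whereas a covariate has its coefficient 
estimated from the data.

```r
set.seed(2)
n    <- 400   # hypothetical number of rows
size <- 30    # hypothetical number of eggs per row

expected_prop <- runif(n, 0.05, 0.6)  # hypothetical patriline proportions
baseline      <- qlogis(expected_prop) # logit of the expected proportion

true_slope <- 0.5  # assumption: the true effect of baseline is weaker than 1
y <- rbinom(n, size, plogis(-0.2 + true_slope * baseline))
d <- data.frame(y = y, fail = size - y, baseline = baseline)

# Offset: baseline has its coefficient fixed at 1, only the intercept is estimated
fit_offset <- glm(cbind(y, fail) ~ 1, offset = baseline,
                  family = binomial, data = d)

# Covariate: the coefficient of baseline is estimated
fit_cov <- glm(cbind(y, fail) ~ baseline, family = binomial, data = d)

coef(fit_offset)  # intercept only
coef(fit_cov)     # intercept plus estimated slope for baseline
```

If the estimated slope is far from one, fixing it at one through an 
offset imposes a constraint the data do not support.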

Cheers,

Jarrod

On 26/01/2017 15:11, Tom Wenseleers wrote:
> Dear all,
> Just to ask a bit of advice about the correct way to specify a nested binomial GLM, in the context of estimating variance components / intraclass correlations to infer heritabilities of a behavioural trait.
>
> The behavioural trait is a binary one (eating an egg or not, 1 or 0), and is performed by known individual honeybees (individually numbered, “individual_ID”) of a known father line (“patriline”) of a given hive (“colony”). Several subsequent egg eating events could be performed by the same individuals. Of each experiment with each colony several trials were done, and for each trial we have data on how many eggs were eaten in total, so we could analyse as a dependent variable the proportion of those eggs that were eaten by a given individual. In addition, we also genotyped a bunch of bees of each colony, which gave us the patriline distribution within each colony (“expected_proportion_patriline”), i.e. the proportion that each patriline makes up in the colony, which I thought should affect the a priori probability that bees of a given patriline would be observed eating eggs.
>
> My question is what mixed model syntax would make most sense to analyse this data, and allows us to infer variance components and intraclass correlations as a basis for a heritability estimate of this egg eating behaviour?
>
> One model I thought of was to include the expected proportion of each patriline that is present as a custom offset, using
> library(afex)
> set_sum_contrasts() # use effect coding
> data$baseline=qlogis(data$expected_proportion_patriline) # custom intercept (qlogis=logit)
> fit1=glmer(cbind(eggs_eaten_by_individual, eggs_eaten_in_total_intrial -eggs_eaten_by_individual)~-1+(1|colony/patriline/ID), offset=baseline, data=data, family=binomial)
> but would this model make sense as a basis to estimate the variance components and intraclass correlation?
>
> In another model I worked with the mean eggs eaten by bees of a given patriline and then fitted the model
> data$baseline=qlogis(data$expected_proportion_patriline) # custom intercept (qlogis=logit)
> fit2=glmer(cbind(eggs_eaten_by_patriline, eggs_eaten_in_total_intrial - eggs_eaten_by_patriline)~-1+(1|colony/patriline), offset=baseline, data=data, family=binomial)
>
> Again though I am not sure if such a model would make sense, and neither of my two models take trial explicitly as a factor. Would anybody have any advice by any chance about the most sensible model given my experimental design (proportion data, with individuals nested in patriline nested in colony, with repeated trials and a priori info on the proportion of eggs that would be expected to be eaten by each patriline based on independent genotyping, which could perhaps be included as a custom intercept or a covariate)?
>
> Best regards,
> Tom Wenseleers
> Dept of Biology
> University of Leuven
> Belgium
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.