[R-sig-ME] Correct specification of nested binomial mixed model with custom intercept to infer variance components and intraclass correlations
Jarrod Hadfield
j.hadfield at ed.ac.uk
Sat Jan 28 15:04:26 CET 2017
Hi,
As long as you code each colony, patriline, ID and trial uniquely (for
example, you don't call the first bees from two different patrilines
bee1) then you don't need to be explicit about the nesting:
(1|colony)+(1|patriline)+(1|ID) = (1|colony/patriline/ID). It used to be
useful for computational reasons to explicitly state that the design was
nested, but now most software (lmer/asreml/MCMCglmm) use algorithms for
detecting this structure to determine a good computational strategy.
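
As a minimal sketch of the unique-coding point (toy data; every name below is made up for illustration and is not from Tom's dataset), this base R snippet checks that with unique labels the grouping factors implied by (1|colony/patriline/ID) split the rows exactly as colony, patriline and ID do on their own, and that this fails when labels are reused across colonies:

```r
# Hypothetical toy design: 2 colonies, 2 patrilines each, 3 bees per patriline.
d <- expand.grid(bee = 1:3, pat = 1:2, colony = c("c1", "c2"))

# Non-unique coding: patriline labels restart within each colony.
d$patriline_local <- factor(d$pat)

# Unique coding: every patriline and every bee gets its own label.
d$patriline <- factor(paste(d$colony, d$pat, sep = "_"))
d$ID <- factor(paste(d$colony, d$pat, d$bee, sep = "_"))

# (1|colony/patriline/ID) expands to the grouping factors
# colony, colony:patriline and colony:patriline:ID.
nested_pat <- interaction(d$colony, d$patriline, drop = TRUE)
nested_id <- interaction(d$colony, d$patriline, d$ID, drop = TRUE)

# TRUE when two factors group the rows in exactly the same way.
same_partition <- function(f, g) {
  nlevels(f) == nlevels(g) &&
    nlevels(interaction(f, g, drop = TRUE)) == nlevels(f)
}

same_partition(d$patriline, nested_pat)        # unique labels
same_partition(d$ID, nested_id)                # unique labels
same_partition(d$patriline_local, nested_pat)  # reused labels
```

With reused labels, (1|patriline_local) would treat "patriline 1" in every colony as the same father line, which is why the shorthand only holds under unique coding.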
You can also fit (1|colony)+(1|patriline)+(1|ID)+(1|trial)+(1|obs) where
obs is a factor with a different level for each row. It accounts for
overdispersion. How many of these random effects you choose to model is
your decision.
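
A rough sketch of the observation-level term, on simulated data (the design sizes, the variable names eaten/not_eaten/trial_no and the 20 eggs per trial are all assumptions for illustration, not Tom's data). The fit is only attempted if lme4 is installed, and on toy data like this it may well warn about a singular fit:

```r
set.seed(1)

# Hypothetical design: 4 colonies x 3 trials x 3 patrilines x 4 bees.
d <- expand.grid(bee = 1:4, pat = 1:3, trial_no = 1:3,
                 colony = paste0("c", 1:4))
d$patriline <- factor(paste(d$colony, d$pat, sep = "_"))   # unique labels
d$ID <- factor(paste(d$patriline, d$bee, sep = "_"))       # unique labels
d$trial <- factor(paste(d$colony, d$trial_no, sep = "_"))  # unique labels
d$obs <- factor(seq_len(nrow(d)))                          # one level per row

# Simulate counts with patriline effects plus row-level extra variation.
n_eggs <- 20  # assumed number of eggs eaten in total per trial
pat_eff <- rnorm(nlevels(d$patriline), sd = 0.5)[d$patriline]
eta <- -2 + pat_eff + rnorm(nrow(d), sd = 0.7)
d$eaten <- rbinom(nrow(d), n_eggs, plogis(eta))
d$not_eaten <- n_eggs - d$eaten

if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::glmer(
    cbind(eaten, not_eaten) ~ 1 + (1 | colony) + (1 | patriline) +
      (1 | ID) + (1 | trial) + (1 | obs),
    data = d, family = binomial
  )
  print(lme4::VarCorr(fit))  # variance components on the logit scale
}
```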
Are there a fixed number of eggs and multiple bees are trying to eat
them, or is each bee assayed alone? If the former, the response
variables might follow a multinomial rather than a binomial distribution.
The patriline variance is accounting for any imbalance in the data, so
it is estimating the variance had there been no skew. I still don't
understand why you want to include the proportional distribution of the
different patrilines in the model, particularly as an offset. Why would
it influence what a single bee of known patriline does? Is it because it is
more likely to eat an egg that belongs to a different patriline? If so,
I understand, but I would just have it in as a standard covariate.
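
To illustrate the difference between the two codings, here is a small sketch using plain glm on simulated data (no random effects, and all names and numbers are invented for the example). As an offset the logit proportion has its coefficient fixed at one; as a standard covariate the coefficient is estimated. The same offset(baseline) versus baseline formula terms should carry over to a glmer formula.

```r
set.seed(2)

n <- 200
prop <- runif(n, 0.05, 0.6)   # hypothetical patriline proportions
baseline <- qlogis(prop)      # logit of the proportion
n_eggs <- 20                  # assumed eggs eaten in total per trial

# Simulate with a true slope of 0.3 on the logit proportion (not 1).
eaten <- rbinom(n, n_eggs, plogis(-1 + 0.3 * baseline))
y <- cbind(eaten, n_eggs - eaten)

# Offset: the coefficient of baseline is fixed at 1, only the intercept is fitted.
fit_offset <- glm(y ~ 1 + offset(baseline), family = binomial)

# Covariate: the coefficient of baseline is estimated from the data.
fit_cov <- glm(y ~ 1 + baseline, family = binomial)

coef(fit_offset)
coef(fit_cov)
```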
Cheers,
Jarrod
On 27/01/2017 14:42, Tom Wenseleers wrote:
> Hi Jarrod,
> In the meantime I think I found what I was doing wrong - I should have included all the control bees that we genotyped and which did not eat any eggs as well in the dataset, specifying they ate 0 eggs, and then I should have specified the standard nested mixed model
> fit1=glmer(cbind(eggs_eaten_by_individual, eggs_eaten_in_total_intrial - eggs_eaten_by_individual) ~ (1|colony/patriline/ID), data=data, family=binomial)
>
> The only residual query is whether trial should be included as well, perhaps as a crossed random factor +(1|trial) in the model?
>
> best regards,
> Tom
>
> ________________________________________
> From: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org> on behalf of Tom Wenseleers <tom.wenseleers at kuleuven.be>
> Sent: 27 January 2017 13:45
> To: Jarrod Hadfield; r-sig-mixed-models at r-project.org
> Subject: [R-sig-ME] Correct specification of nested binomial mixed model with custom intercept to infer variance components and intraclass correlations
>
> Hi Jarrod,
> Well, the structure of the datafile is such that each row contains the proportion of the eggs eaten in each trial by each individual bee (that belongs to a particular patriline, i.e. there would be several lines per trial corresponding to different individual bees, some of which might also occur again in different trials with the same colony) (this is in fit1 below) or the mean proportion of the eggs eaten in each trial by a particular patriline (in fit2 below). So I was not entirely sure how I should still incorporate trial as a random effect as well, and what this would look like.
> The reason that I thought it would make sense to also include the proportional distribution of the different patrilines in the colony (either as a custom intercept or a covariate, not sure about that) in the model is that these give the a priori probability that eggs would be eaten by bees belonging to different patrilines (as they are heavily skewed), so the heritability of the trait would be expressed mainly in terms of some patrilines making a much greater or much lower contribution to egg eating than expected based on their proportional presence in the colony. But I am not sure how I would correctly do that in my model, hence my question? Makes sense?
>
> cheers,
> Tom
>
> ________________________________________
> From: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org> on behalf of Jarrod Hadfield <j.hadfield at ed.ac.uk>
> Sent: 27 January 2017 06:37
> To: r-sig-mixed-models at r-project.org
> Subject: Re: [R-sig-ME] Correct specification of nested binomial mixed model with custom intercept to infer variance components and intraclass correlations
>
> Hi Tom,
>
> If I understand your experiment/data set up, each row of the data frame
> contains all the data for one trial? If so, having trial as a random
> effect is one way of modelling any overdispersion in the data with
> respect to the binomial. If overdispersion exists it is important to
> model this. Other than that the random effect structure seems fine.
>
> Also I don't understand why baseline is fitted, especially as an offset.
> If you are fitting patriline as a random effect presumably you know the
> patriline for each bee? Why then fit the proportion of the colony that
> has the same patriline as the bee, and why fix the associated
> coefficient to one?
>
> Cheers,
>
> Jarrod
>
>
>
> On 26/01/2017 15:11, Tom Wenseleers wrote:
>> Dear all,
>> Just to ask a bit of advice about the correct way to specify a nested binomial GLM, in the context of estimating variance components / intraclass correlations to infer heritabilities of a behavioural trait.
>>
>> The behavioural trait is a binary one (eating an egg or not, 1 or 0), and is performed by known individual honeybees (individually numbered, “individual_ID”) of a known father line (“patriline”) of a given hive (“colony”). The same individual could perform several successive egg-eating events. Several trials were run for each experiment with each colony, and for each trial we have data on how many eggs were eaten in total, so we could analyse as a dependent variable the proportion of those eggs that were eaten by a given individual. In addition, we also genotyped a sample of bees from each colony, which gave us the patriline distribution within each colony (“expected_proportion_patriline”), i.e. the proportion that each patriline makes up in the colony, which I thought should affect the a priori probability that bees of a given patriline would be observed eating eggs.
>>
>> My question is what mixed model syntax would make most sense to analyse these data and would allow us to infer variance components and intraclass correlations as a basis for a heritability estimate of this egg eating behaviour?
>>
>> One model I thought of was to include the expected proportion of each patriline that is present as a custom offset, using
>> library(afex)
>> set_sum_contrasts() # use effect coding
>> data$baseline=qlogis(data$expected_proportion_patriline) # custom intercept (qlogis=logit)
>> fit1=glmer(cbind(eggs_eaten_by_individual, eggs_eaten_in_total_intrial -eggs_eaten_by_individual)~-1+(1|colony/patriline/ID), offset=baseline, data=data, family=binomial)
>> but would this model make sense as a basis to estimate the variance components and intraclass correlation?
>>
>> In another model I worked with the mean eggs eaten by bees of a given patriline and then fitted the model
>> data$baseline=qlogis(data$expected_proportion_patriline) # custom intercept (qlogis=logit)
>> fit2=glmer(cbind(eggs_eaten_by_patriline, eggs_eaten_in_total_intrial - eggs_eaten_by_patriline)~-1+(1|colony/patriline), offset=baseline, data=data, family=binomial)
>>
>> Again though I am not sure if such a model would make sense, and neither of my two models takes trial explicitly as a factor. Would anybody have any advice on the most sensible model given my experimental design (proportion data, with individuals nested in patriline nested in colony, with repeated trials and a priori info on the proportion of eggs that would be expected to be eaten by each patriline based on independent genotyping, which could perhaps be included as a custom intercept or a covariate)?
>>
>> Best regards,
>> Tom Wenseleers
>> Dept of Biology
>> University of Leuven
>> Belgium
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.