[R-sig-ME] [R] Question on overdispersion

Jarrod Hadfield j.hadfield at ed.ac.uk
Fri Nov 19 10:52:00 CET 2010


Hi Thierry + nameless,

It is not necessary to expand the binomial into Bernoulli trials (nor  
advisable if n and/or the binomial size are large). You can just fit  
observation-level random effects:

dataset$resid<-as.factor(1:dim(dataset)[1])

fit3 <- glmer(cbind(male_chick_no, female_chick_no) ~ 1+(1|FemaleID)+  
(1|resid), data = dataset, family = binomial)

gives the same answer as fit2

Cheers,

Jarrod






Quoting "ONKELINX, Thierry" <Thierry.ONKELINX at inbo.be>:

> Dear Nameless,
>
> The quasi distribution can no longer be used in lme4 because a) the
> results were not very reliable b) there is an alternative to model
> overdispersion.
>
> The alternative is to expand your dataset to bernoulli trials. Then add
> a random effect with one level per observation. This random effect will
> model additive overdisperion. The quasi distributions model
> overdisperion multiplicative.
>
> In the example below, the random effect of RowID has 0 variance. Hence
> no overdispersion.
>
> dataset <- data.frame(
> 	male_chick_no = c(2,4,1,0,3,5,2),
> 	female_chick_no=c(1,0,3,3,1,0,2),
> 	FemaleID=c("A","A","B","B","C","D","E"))
>
> longFormat <- do.call(rbind, lapply(seq_len(nrow(dataset)), function(i){
> 	with(dataset, data.frame(Sex = c(rep("M", male_chick_no[i]),
> rep("F", female_chick_no[i])), FemaleID = FemaleID[i]))
> }))
> longFormat$FemaleID <- factor(longFormat$FemaleID)
> longFormat$RowID <- factor(seq_len(nrow(longFormat)))
> longFormat$Male <- longFormat$Sex == "M"
>
> library(lme4)
> fit1 <- glmer(Male ~ (1|FemaleID), data = longFormat, family = binomial)
> fit2 <- glmer(Male ~ (1|FemaleID) + (1|RowID), data = longFormat, family
> = binomial)
> anova(fit1, fit2)
>
> Best regards,
>
> Thierry
>
> PS sig-mixed-models is a better mailinglist for this kind of questions.
>
> ------------------------------------------------------------------------
> ----
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek
> team Biometrie & Kwaliteitszorg
> Gaverstraat 4
> 9500 Geraardsbergen
> Belgium
>
> Research Institute for Nature and Forest
> team Biometrics & Quality Assurance
> Gaverstraat 4
> 9500 Geraardsbergen
> Belgium
>
> tel. + 32 54/436 185
> Thierry.Onkelinx at inbo.be
> www.inbo.be
>
> To call in the statistician after the experiment is done may be no more
> than asking him to perform a post-mortem examination: he may be able to
> say what the experiment died of.
> ~ Sir Ronald Aylmer Fisher
>
> The plural of anecdote is not data.
> ~ Roger Brinner
>
> The combination of some data and an aching desire for an answer does not
> ensure that a reasonable answer can be extracted from a given body of
> data.
> ~ John Tukey
>
>
>> -----Oorspronkelijk bericht-----
>> Van: r-help-bounces at r-project.org
>> [mailto:r-help-bounces at r-project.org] Namens cct663
>> Verzonden: vrijdag 19 november 2010 5:39
>> Aan: r-help at r-project.org
>> Onderwerp: [R] Question on overdispersion
>>
>>
>> I have a few questions relating to overdispersion in a sex
>> ratio data set that I am working with (note that I already
>> have an analysis with GLMMs for fixed effects, this is just
>> to estimate dispersion). The response variable is binomial
>> because nestlings can only be male or female. I have samples of
>> 1-5 nestlings from each nest (individuals within a nest are
>> not independent, so the response variable is the ratio of
>> sons to daughters) and some females have multiple nests in
>> the data set (so I need to include female identity as a
>> random effect).
>>
>> Here is an example of what the three vectors used in the
>> model look like (the real data set is much bigger, just to
>> illustrate what I'm talking
>> about):
>>
>> male_chick_no=c(2,4,1,0,3,5,2)
>> female_chick_no=c(1,0,3,3,1,0,2)
>> FemaleID=c(A,A,B,B,C,D,E)
>>
>> My first question relates to coding the test in R. I received
>> this suggested R syntax from a reviewer:
>>
>> SexRatio = cbind(male_chick_no, female_chick_no)
>>
>> Model <- lmer(SexRatio ~ 1 +(1|FemaleID), family = quasibinomial)
>>
>> But when I try to use this in R I get the error: "Error in
>> glmer(formula = ratio ~ 1 + (1 | femid), family =
>> quasibinomial) : "quasi" families cannot be used in glmer".
>>
>> I've tried playing around with some other mixed model
>> functions but can't seem to find one that will provide an
>> estimate of dispersion and allow me to include my random effect.
>>
>> Is there some other function I should be using? Or is there a
>> different syntax that I should use for lmer?
>>
>> My second question is more general: I understand that with
>> binomial data overdispersion suggests that the observed data
>> have a greater variance than expected given binomial errors
>> (in my case this means that more nests would be all male/all
>> female than expected if sex is random). So with binomial
>> errors the expected estimate of dispersion is 1, if I find
>> that dispersion is > 1 it suggests that my data are
>> overdispersed. My question is, how much greater than 1 should
>> that number be to conclude that the data are overdispersed?
>> Is there a rule of thumb or does it just depend on the dataset?
>>
>> I was thinking of doing a randomization test with the same
>> structure (nest size and female id) as my real data set but
>> with sex ratio of each nest randomized with a 50:50 chance of
>> individuals being sons or daughters and comparing my observed
>> dispersion to the distribution of dispersions from the
>> randomization test. Would this be a valid way to ask whether
>> my data are overdispersed? Is it even necessary?
>>
>> Any help/advice that you can provide would be greatly
>> appreciated. I am relatively new to R so explicit
>> instructions (i.e. easy to follow) would be wonderful.
>>
>> Thanks.
>>
>> --
>> View this message in context:
>> http://r.789695.n4.nabble.com/Question-on-overdispersion-tp304
>> 9898p3049898.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
>



-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.




More information about the R-sig-mixed-models mailing list