[R-sig-ME] [R] Question on overdispersion
Thierry.ONKELINX at inbo.be
Fri Nov 19 10:59:11 CET 2010
Thanks for the information
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek
team Biometrie & Kwaliteitszorg
Research Institute for Nature and Forest
team Biometrics & Quality Assurance
tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data.
~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
~ John Tukey
> -----Oorspronkelijk bericht-----
> Van: Jarrod Hadfield [mailto:j.hadfield at ed.ac.uk]
> Verzonden: vrijdag 19 november 2010 10:52
> Aan: ONKELINX, Thierry
> CC: cct663; r-help at r-project.org; r-sig-mixed-models at r-project.org
> Onderwerp: Re: [R-sig-ME] [R] Question on overdispersion
> Hi Thierry + nameless,
> It is not necessary to expand the binomial into Bernoulli
> trials (nor advisable if n and/or the binomial size are
> large). You can just fit observation-level random effects:
> fit3 <- glmer(cbind(male_chick_no, female_chick_no) ~
> 1+(1|FemaleID)+ (1|resid), data = dataset, family = binomial)
> gives the same answer as fit2
> Quoting "ONKELINX, Thierry" <Thierry.ONKELINX at inbo.be>:
> > Dear Nameless,
> > The quasi distribution can no longer be used in lme4 because a) the
> > results were not very reliable b) there is an alternative to model
> > overdispersion.
> > The alternative is to expand your dataset to bernoulli trials. Then
> > add a random effect with one level per observation. This
> random effect
> > will model additive overdisperion. The quasi distributions model
> > overdisperion multiplicative.
> > In the example below, the random effect of RowID has 0
> variance. Hence
> > no overdispersion.
> > dataset <- data.frame(
> > male_chick_no = c(2,4,1,0,3,5,2),
> > female_chick_no=c(1,0,3,3,1,0,2),
> > FemaleID=c("A","A","B","B","C","D","E"))
> > longFormat <- do.call(rbind, lapply(seq_len(nrow(dataset)),
> > with(dataset, data.frame(Sex = c(rep("M",
> male_chick_no[i]), rep("F",
> > female_chick_no[i])), FemaleID = FemaleID[i]))
> > }))
> > longFormat$FemaleID <- factor(longFormat$FemaleID)
> longFormat$RowID <-
> > factor(seq_len(nrow(longFormat))) longFormat$Male <-
> longFormat$Sex ==
> > "M"
> > library(lme4)
> > fit1 <- glmer(Male ~ (1|FemaleID), data = longFormat, family =
> > binomial)
> > fit2 <- glmer(Male ~ (1|FemaleID) + (1|RowID), data = longFormat,
> > family = binomial) anova(fit1, fit2)
> > Best regards,
> > Thierry
> > PS sig-mixed-models is a better mailinglist for this kind
> of questions.
> > --
> > ----
> > ir. Thierry Onkelinx
> > Instituut voor natuur- en bosonderzoek team Biometrie &
> > Gaverstraat 4 9500 Geraardsbergen Belgium
> > Research Institute for Nature and Forest team Biometrics & Quality
> > Assurance Gaverstraat 4 9500 Geraardsbergen Belgium
> > tel. + 32 54/436 185
> > Thierry.Onkelinx at inbo.be
> > www.inbo.be
> > To call in the statistician after the experiment is done may be no
> > more than asking him to perform a post-mortem examination:
> he may be
> > able to say what the experiment died of.
> > ~ Sir Ronald Aylmer Fisher
> > The plural of anecdote is not data.
> > ~ Roger Brinner
> > The combination of some data and an aching desire for an
> answer does
> > not ensure that a reasonable answer can be extracted from a
> given body
> > of data.
> > ~ John Tukey
> >> -----Oorspronkelijk bericht-----
> >> Van: r-help-bounces at r-project.org
> >> [mailto:r-help-bounces at r-project.org] Namens cct663
> >> Verzonden: vrijdag 19 november 2010 5:39
> >> Aan: r-help at r-project.org
> >> Onderwerp: [R] Question on overdispersion
> >> I have a few questions relating to overdispersion in a sex
> ratio data
> >> set that I am working with (note that I already have an
> analysis with
> >> GLMMs for fixed effects, this is just to estimate dispersion). The
> >> response variable is binomial because nestlings can only
> be male or
> >> female. I have samples of
> >> 1-5 nestlings from each nest (individuals within a nest are not
> >> independent, so the response variable is the ratio of sons to
> >> daughters) and some females have multiple nests in the
> data set (so I
> >> need to include female identity as a random effect).
> >> Here is an example of what the three vectors used in the
> model look
> >> like (the real data set is much bigger, just to illustrate
> what I'm
> >> talking
> >> about):
> >> male_chick_no=c(2,4,1,0,3,5,2)
> >> female_chick_no=c(1,0,3,3,1,0,2)
> >> FemaleID=c(A,A,B,B,C,D,E)
> >> My first question relates to coding the test in R. I received this
> >> suggested R syntax from a reviewer:
> >> SexRatio = cbind(male_chick_no, female_chick_no)
> >> Model <- lmer(SexRatio ~ 1 +(1|FemaleID), family = quasibinomial)
> >> But when I try to use this in R I get the error: "Error in
> >> glmer(formula = ratio ~ 1 + (1 | femid), family =
> >> quasibinomial) : "quasi" families cannot be used in glmer".
> >> I've tried playing around with some other mixed model
> functions but
> >> can't seem to find one that will provide an estimate of dispersion
> >> and allow me to include my random effect.
> >> Is there some other function I should be using? Or is there a
> >> different syntax that I should use for lmer?
> >> My second question is more general: I understand that with
> >> data overdispersion suggests that the observed data have a greater
> >> variance than expected given binomial errors (in my case
> this means
> >> that more nests would be all male/all female than expected
> if sex is
> >> random). So with binomial errors the expected estimate of
> >> is 1, if I find that dispersion is > 1 it suggests that my
> data are
> >> overdispersed. My question is, how much greater than 1 should that
> >> number be to conclude that the data are overdispersed?
> >> Is there a rule of thumb or does it just depend on the dataset?
> >> I was thinking of doing a randomization test with the same
> >> (nest size and female id) as my real data set but with sex
> ratio of
> >> each nest randomized with a 50:50 chance of individuals
> being sons or
> >> daughters and comparing my observed dispersion to the
> distribution of
> >> dispersions from the randomization test. Would this be a
> valid way to
> >> ask whether my data are overdispersed? Is it even necessary?
> >> Any help/advice that you can provide would be greatly
> appreciated. I
> >> am relatively new to R so explicit instructions (i.e. easy
> to follow)
> >> would be wonderful.
> >> Thanks.
> >> --
> >> View this message in context:
> >> http://r.789695.n4.nabble.com/Question-on-overdispersion-tp304
> >> 9898p3049898.html
> >> Sent from the R help mailing list archive at Nabble.com.
> >> ______________________________________________
> >> R-help at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> > _______________________________________________
> > R-sig-mixed-models at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> The University of Edinburgh is a charitable body, registered
> in Scotland, with registration number SC005336.
More information about the R-sig-mixed-models