[R-sig-ME] Choosing appropriate priors for bglmer mixed models in blme

Vincent Dorie vjd4 at nyu.edu
Tue Mar 10 16:14:51 CET 2015


Cauchy is just t with 1 degree of freedom.

Vince

> On Mar 10, 2015, at 3:15 AM, Jarrod Hadfield <j.hadfield at ed.ac.uk> wrote:
> 
> Hi Josie,
> 
> Yes - I would scale your input variables and go for the t-prior in blme (I think the Cauchy prior is not implemented?). You might want to up the scale a little from that recommended in order to deal with the fact you may have non-zero random effects, but it might not make a big difference if their variance isn't too large.
> 
> Cheers,
> 
> Jarrod
> 
> 
> 
> Quoting Josie Galbraith <josie.galbraith at gmail.com> on Tue, 10 Mar 2015 10:30:12 +1300:
> 
>> Hi Jarrod,
>> 
>> I'm pretty sure it is a complete separation issue.  This is the xtab of
>> counts for the main factors:
>> 
>>                                LESION  0    1
>> SEASON     FOOD
>> Autumn            NF                 38   2
>>                          F                  21   0
>> Spring              NF                 27   3
>>                         F                   76  11
>> 
>> Lesion incidences were low generally, but particularly so in Autumn (and
>> fewer replicates in Autumn).
>> 
>> Thanks again,
>> Josie
>> 
>> 
>> 
>> 
>> 
>> On Mon, Mar 9, 2015 at 8:50 PM, Jarrod Hadfield <j.hadfield at ed.ac.uk> wrote:
>> 
>>> Hi Josie,
>>> 
>>> Is the problem you are having because of complete separation, either
>>> because you have some very good predictors of lesions and/or you have low
>>> replication for some factor levels? If so blmer with Gelman's recommended
>>> prior (not the diffuse prior) should do a reasonable job of allowing
>>> sensible inferences to be made. However, as Ben said in an earlier post,its
>>> not clear that this is the problem.
>>> 
>>> Similar issues are possible with the random effects, but this tends to be
>>> rare because they are constrained. I only see it when the variance
>>> component is very large, not zero as here.
>>> 
>>> If the perceived problem is zero variance estimates, I'm not sure why this
>>> is a problem. If the true variances are zero you should expect a MLE of
>>> zero 50% of the time. With only 8 levels of the random effect, you should
>>> expect an MLE of zero often, even if the true variance is moderate. The
>>> same power issues will generate MLE correlations of -1 and 1.
>>> 
>>> Cheers,
>>> 
>>> Jarrod
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> Quoting Josie Galbraith <josie.galbraith at gmail.com> on Mon, 9 Mar 2015
>>> 13:08:59 +1300:
>>> 
>>> Thanks very much Jarrod & Vince for your inputs.
>>>> Admittedly this analysis is stretching my level of understanding!
>>>> 
>>>> From a practical point of view, given that time is of the essence in
>>>> writing up my PhD, if I only want to test the main effects of my model
>>>> (rather than make predictions etc), is this something I can achieve in
>>>> blme? (ie testing model terms using LRTs).  If so should I be using blme
>>>> for this?
>>>> 
>>>> Or should I really be working in MCMCglmm (which I haven't used before -
>>>> another learning curve!)?  Any further thoughts on using normal priors
>>>> rather than Cauchy?
>>>> 
>>>> Thanks again,
>>>> Josie
>>>> 
>>>> 
>>>> 
>>>> 
>>>> Message: 2
>>>>> 
>>>>> Hi Vince,
>>>>> 
>>>>> For a given difference on the logit scale between (lets say) two
>>>>> treatment groups then the difference on the observed scale depends on
>>>>> the magnitude of the variance components. For logit effects beta1 and
>>>>> beta2, the expected difference is approximately:
>>>>> 
>>>>> plogis(beta1/sqrt(1+c2*v))-plogis(beta2/sqrt(1+c2*v))
>>>>> 
>>>>> where v is the variance component and c2 = (16*sqrt(3)/(15*pi))^2.
>>>>> 
>>>>> If a prior (Cauchy or otherwise) was set up that was invariant to v
>>>>> then it would imply different prior beliefs about the magnitude of the
>>>>> difference (on the observed scale) depending on v. For the normal
>>>>> prior it would imply that when v is large we should expect smaller
>>>>> differences between treatment groups. This maybe OK (I'm not sure) but
>>>>> if not is there a way to make it invariant for the t/Cauchy prior? For
>>>>> the normal you can make the scale = sqrt(v+pi^2/3) which seems to work
>>>>> OKish.
>>>>> 
>>>>> Cheers,
>>>>> 
>>>>> Jarrod
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> Quoting Vincent Dorie <vjd4 at nyu.edu> on Sat, 7 Mar 2015 09:47:40 -0500:
>>>>> 
>>>>> > Just to follow up on Gelman's Cauchy prior, it seems to work quite
>>>>> > well even in glmms. I don't have any theoretical results as of yet,
>>>>> > but if you look at the sampling distribution of the fixed effects
>>>>> > for any model, they cluster rather nicely. You get "sane" estimates
>>>>> > for when no kind of separation is involved, infinite (or convergence
>>>>> > failures) for complete/quasi complete separation, and a third group
>>>>> > exists with large estimates for when a group contains all 0s or 1s.
>>>>> > In the third case, a random effect can perfectly predict for that
>>>>> > group, but because they're integrated out the likelihood remains
>>>>> > well defined. You'll just get really large estimates of random
>>>>> > effects, which then go with large estimates of fixed effects.
>>>>> >
>>>>> > So long as you believe that some effect magnitudes for logistic
>>>>> > regression pretty much never happen in nature, the Cauchy prior does
>>>>> > a good job of pulling the extreme cases back down to earth while
>>>>> > leaving the well-estimated ones roughly in place. That being said,
>>>>> > using the priors in blme to patch up a data set is really only
>>>>> > advised for checking the viability of a model (usually one among
>>>>> > many, rapidly fit). After that, using something like MCMCglmm for a
>>>>> > fully Bayesian analysis is the way to go.
>>>>> >
>>>>> > Vince
>>>>> >
>>>>> >> On Mar 7, 2015, at 3:09 AM, Jarrod Hadfield <j.hadfield at ed.ac.uk>
>>>>> wrote:
>>>>> >>
>>>>> >> Hi Josie,
>>>>> >>
>>>>> >> Regarding the priors on the fixed effects, if complete separation
>>>>> >> is the issue having a diffuse prior is not going to help. Gelman
>>>>> >> (2008) gives some recommendations about priors for logistic
>>>>> >> regression. Although a Cauchy-prior was considered better than a
>>>>> >> t-prior, the latter can be used in blmer and should alleviate
>>>>> >> complete separation issues. I tend to use a normal-prior after
>>>>> >> performing Gelman's rescaling, but this is mainly because MCMCglmm
>>>>> >> only handles normal priors for the fixed effects (this may not be
>>>>> >> true). In a hierarchical model I'm not sure Gelman's advice holds:
>>>>> >> at least with a normal-prior it makes sense to increase the prior
>>>>> >> variance as the random-effect variances increase. If the prior
>>>>> >> variance is approximately v+pi^2/3, where v is the sum of the
>>>>> >> variance components, then the effects on the probability scale are
>>>>> >> quite close to being uniform on the 0,1 interval.
>>>>> >>
>>>>> >> You can use the gelman.prior function to obtain the prior
>>>>> >> covariance matrix for your model. However, note that in the help
>>>>> >> file I say that the scale argument takes the standard deviation. In
>>>>> >> fact it takes the variance, but in the next version of MCMCglmm
>>>>> >> (coming soon) I have fixed this and it will take the standard
>>>>> >> deviation.
>>>>> >>
>>>>> >> Cheers,
>>>>> >>
>>>>> >> Jarrod
>>>>> >>
>>>>> >>
>>>>> >> Gelman, A. et al. (2008) The Annals of Appled Statistics 2 4 1360-1383
>>>>> >>
>>>>> >>
>>>>> >> Quoting Josie Galbraith <josie.galbraith at gmail.com> on Sat, 7 Mar
>>>>> >> 2015 12:15:41 +1300:
>>>>> >>
>>>>> >>> Thanks Ben,
>>>>> >>> I didn't have problems with singular estimates of variance components
>>>>> with
>>>>> >>> this data set.  However, I have a few other pathogens/parasites that
>>>>> I'm
>>>>> >>> looking at (I'm running separate models for each), and after looking
>>>>> at all
>>>>> >>> of them some do have zero variances for the random effect, either in
>>>>> >>> addition to large parameter estimates or alongside reasonable
>>>>> parameter
>>>>> >>> estimates.
>>>>> >>> Should I be also be imposing a covariance prior in either of these
>>>>> cases?
>>>>> >>>
>>>>> >>> As a related aside, my data are collected from individual birds -
>>>>> captured
>>>>> >>> over 4 sampling rounds (6 months apart).  While the majority of
>>>>> >>> observations are independent, there is a small proportion of birds
>>>>> that
>>>>> >>> were recaptured in a subsequent sampling round (between 2?15% of
>>>>> >>> observations, depending on which response variable).  I have modelled
>>>>> my
>>>>> >>> data both both with and without bird ID as a random effect.
>>>>> Including
>>>>> it
>>>>> >>> seems to cause more problems with zero variances.  Is this because
>>>>> too
>>>>> few
>>>>> >>> of the birds have actually been resampled?
>>>>> >>>
>>>>> >>> Cheers,
>>>>> >>> Josie
>>>>> >>>
>>>>> >>>
>>>>> >>>
>>>>> >>>> Josie Galbraith <josie.galbraith at ...> writes:
>>>>> >>>>
>>>>> >>>> >
>>>>> >>>>
>>>>> >>>> [snip]
>>>>> >>>>
>>>>> >>>> >
>>>>> >>>> > I'm after some advice on how to choose which priors to use.  I
>>>>> gather I
>>>>> >>>> > need to impose a weak prior on the fixed effects of my model but
>>>>> no
>>>>> >>>> > covariance priors - is this correct?  Can I use a default prior
>>>>> (i.e. t,
>>>>> >>>> or
>>>>> >>>> > normal defaults in the blme package) or does it depend on my data?
>>>>> What
>>>>> >>>> is
>>>>> >>>> > considered a suitably weak prior?
>>>>> >>>>
>>>>> >>>>   If all you're trying to do is deal with complete separation (and
>>>>> not,
>>>>> >>>> e.g. singular estimates of variance components [typically indicated
>>>>> >>>> by zero variances or +/- 1 correlations, although I'm not sure those
>>>>> >>>> are necessary conditions for singularity]), then it should be OK
>>>>> >>>> to put the prior only on the fixed effects.  Generally speaking a
>>>>> >>>> weak prior is one with a standard deviation that is large relative
>>>>> >>>> to the expected scale of the effect (e.g. we might say sigma=10 is
>>>>> >>>> large, but it won't be if the units of measurement are very small
>>>>> >>>> so that a typical value of the mean is 100,000 ...)
>>>>> >>>>
>>>>> >>>> > I am running binomial models for epidemiology data (response
>>>>> variable is
>>>>> >>>> > presence/absence of lesions), with 2 fixed effects (FOOD: F/NF;
>>>>> SEASON:
>>>>> >>>> > Autumn/Spring) and a random effect (SITE: 8 levels).  The main
>>>>> goal
>>>>> of
>>>>> >>>> > these models is to test for an effect of the treatment 'FOOD.'
>>>>> I'm
>>>>> >>>> > guessing from what I've read, that my model should be something
>>>>> like the
>>>>> >>>> > following:
>>>>> >>>>
>>>>> >>>>
>>>>> >>>> This seems fairly reasonable at first glance.  Where were you seeing
>>>>> >>>> the complete separation, though?   I would normally expect to
>>>>> >>>> see at least one of the parameters still being reasonably large
>>>>> >>>> if that's the case.
>>>>> >>>>
>>>>> >>>> > bglmer (LESION ~ FOOD*SEASON +(1|SITE), data = SEYE.df, family =
>>>>> >>>> binomial,
>>>>> >>>> > fixef.prior = normal, cov.prior = NULL)
>>>>> >>>> >
>>>>> >>>> > This is the output when I run the model:
>>>>> >>>> >
>>>>> >>>> > Fixef prior: normal(sd = c(10, 2.5, ...), corr = c(0 ...),
>>>>> >>>> common.scale =
>>>>> >>>> > FALSE)
>>>>> >>>> > Prior dev  : 18.2419
>>>>> >>>> >
>>>>> >>>> > Generalized linear mixed model fit by maximum likelihood (Laplace
>>>>> >>>> > Approximation) [
>>>>> >>>> > bglmerMod]
>>>>> >>>> >  Family: binomial ( logit )
>>>>> >>>> > Formula: LESION ~ FOOD * SEASON + (1 | SITE)
>>>>> >>>> >    Data: SEYE.df
>>>>> >>>> >
>>>>> >>>>
>>>>> >>>> [snip]
>>>>> >>>>
>>>>> >>>> > Random effects:
>>>>> >>>> >  Groups Name        Variance Std.Dev.
>>>>> >>>> >  SITE   (Intercept) 0.3064   0.5535
>>>>> >>>> > Number of obs: 178, groups: SITE, 8
>>>>> >>>> >
>>>>> >>>> > Fixed effects:
>>>>> >>>> >                     Estimate Std. Error z value Pr(>|z|)
>>>>> >>>> > (Intercept)          -3.7664     1.4551  -2.588  0.00964 **
>>>>> >>>> > FOODNF                0.5462     1.6838   0.324  0.74567
>>>>> >>>> > SEASONSpring          1.7529     1.4721   1.191  0.23378
>>>>> >>>> > FOODNF:SEASONSpring  -0.8151     1.7855  -0.456  0.64803
>>>>> >>>> > ---
>>>>> >>>> > Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
>>>>> >>>> >
>>>>> >>>>
>>>>> >>>> [snip]
>>>>> >>>>
>>>>> >>>> ------------------------------
>>>>> >>>>
>>>>> >>>
>>>>> >>>
>>>>> >>> --
>>>>> >>> *Josie Galbraith* MSc (hons)
>>>>> >>>
>>>>> >>> PhD candidate
>>>>> >>> *University of Auckland *
>>>>> >>> Joint Graduate School in Biodiversity and Biosecurity ? School of
>>>>> >>> Biological Sciences ? Tamaki Campus ? Private Bag 92019 ? Auckland
>>>>> 1142* ?
>>>>> >>> P:* 09-373 7599 ext. 83132* ? E:* josie.galbraith at gmail.com* ? W: *
>>>>> UoA Web
>>>>> >>> Profile <https://unidirectory.auckland.ac.nz/profile/jgal026> and
>>>>> >>> *www.birdfeedingnz.weebly.com/* <http://birdfeedingnz.weebly.com/>
>>>>> >>>
>>>>> >>>     [[alternative HTML version deleted]]
>>>>> >>>
>>>>> >>> _______________________________________________
>>>>> >>> R-sig-mixed-models at r-project.org mailing list
>>>>> >>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>>> >>>
>>>>> >>
>>>>> >>
>>>>> >> --
>>>>> >> The University of Edinburgh is a charitable body, registered in
>>>>> >> Scotland, with registration number SC005336.
>>>>> >>
>>>>> >> _______________________________________________
>>>>> >> R-sig-mixed-models at r-project.org mailing list
>>>>> >> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>>> >
>>>>> > _______________________________________________
>>>>> > R-sig-mixed-models at r-project.org mailing list
>>>>> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>>> >
>>>>> >
>>>>> 
>>>>> 
>>>>> --
>>>>> The University of Edinburgh is a charitable body, registered in
>>>>> Scotland, with registration number SC005336.
>>>>> 
>>>>> 
>>>>> 
>>>>> ------------------------------
>>>>> 
>>>>> Subject: Digest Footer
>>>>> 
>>>>> _______________________________________________
>>>>> R-sig-mixed-models mailing list
>>>>> R-sig-mixed-models at r-project.org
>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>>> 
>>>>> 
>>>>> ------------------------------
>>>>> 
>>>>> End of R-sig-mixed-models Digest, Vol 99, Issue 11
>>>>> **************************************************
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> --
>>>> *Josie Galbraith* MSc (hons)
>>>> 
>>>> PhD candidate
>>>> *University of Auckland *
>>>> Joint Graduate School in Biodiversity and Biosecurity ● School of
>>>> Biological Sciences ● Tamaki Campus ● Private Bag 92019 ● Auckland 1142* ●
>>>> P:* 09-373 7599 ext. 83132* ● E:* josie.galbraith at gmail.com* ● W: * UoA
>>>> Web
>>>> Profile <https://unidirectory.auckland.ac.nz/profile/jgal026> and
>>>> *www.birdfeedingnz.weebly.com/* <http://birdfeedingnz.weebly.com/>
>>>> 
>>>> 
>>> 
>>> 
>>> --
>>> The University of Edinburgh is a charitable body, registered in
>>> Scotland, with registration number SC005336.
>>> 
>>> 
>>> 
>> 
>> 
>> --
>> *Josie Galbraith* MSc (hons)
>> 
>> PhD candidate
>> *University of Auckland *
>> Joint Graduate School in Biodiversity and Biosecurity ● School of
>> Biological Sciences ● Tamaki Campus ● Private Bag 92019 ● Auckland 1142* ●
>> P:* 09-373 7599 ext. 83132* ● E:* josie.galbraith at gmail.com* ● W: * UoA Web
>> Profile <https://unidirectory.auckland.ac.nz/profile/jgal026> and
>> *www.birdfeedingnz.weebly.com/* <http://birdfeedingnz.weebly.com/>
>> 
> 
> 
> 
> -- 
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
> 
> 



More information about the R-sig-mixed-models mailing list