[R-sig-ME] Bradley Terry GLMM in R ?

Jeroen van Paridon vanparidon using wisc.edu
Fri Oct 28 08:17:27 CEST 2022


In case it's helpful to anyone following this email thread: I wrote a vignette explaining how to fit a Bradley-Terry model in lme4 using lmerMultiMember. You can find it at https://jvparidon.github.io/lmerMultiMember/articles/bradleyterry_models.html


Cheers,

JP van Paridon (he/him)
​Research Associate, Lupyan Lab
University of Wisconsin-Madison
https://github.com/jvparidon
________________________________
From: Jeroen van Paridon <vanparidon using wisc.edu>
Sent: Thursday, October 20, 2022 1:42 AM
To: r-sig-mixed-models using r-project.org <r-sig-mixed-models using r-project.org>
Subject: Re: [R-sig-ME] Bradley Terry GLMM in R ?

Hi,

Just to expand on Ben's last email: In principle, lmerMultiMember allows you to pass arbitrary indicator/weight matrices for the random effects to lme4 for model fitting as long as they have the correct shape. The package contains helper functions for generating more conventional multiple membership matrices since that was my own use-case, but if you create your own matrix with opposed (1 and -1) weights I see no reason why it shouldn't work.

Membership matrices need to be sparse matrices of class Matrix::dgCMatrix and shape n_groups x n_obs. You can probably just take whatever indicator matrix you already have, transpose it, and then cast it to the sparse format.
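
For concreteness, here's a rough (untested) sketch of what that might look like for a Bradley-Terry setup, assuming a data frame df with columns player1, player2, and a binary outcome win1 (1 if player1 won); see the package docs for the exact argument names:

library(Matrix)
library(lmerMultiMember)

players <- sort(unique(c(df$player1, df$player2)))
n_obs <- nrow(df)

# n_groups x n_obs sparse matrix: +1 in the row of player1, -1 in the row
# of player2, 0 elsewhere; sparseMatrix() returns a dgCMatrix by default
W <- sparseMatrix(
  i = c(match(df$player1, players), match(df$player2, players)),
  j = rep(seq_len(n_obs), 2),
  x = rep(c(1, -1), each = n_obs),
  dims = c(length(players), n_obs),
  dimnames = list(players, NULL)
)

# the grouping name in the formula matches the name in the memberships list
fit <- lmerMultiMember::glmer(
  win1 ~ 1 + (1 | players),
  data = df,
  family = binomial,
  memberships = list(players = W)
)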

If you're going this route and run into any issues, feel free to reach out to me directly.


Cheers,

JP van Paridon (he/him)
​Research Associate, Lupyan Lab
University of Wisconsin-Madison
https://github.com/jvparidon
________________________________
From: R-sig-mixed-models <r-sig-mixed-models-bounces using r-project.org> on behalf of Ben Bolker <bbolker using gmail.com>
Sent: Wednesday, October 19, 2022 2:43 PM
To: r-sig-mixed-models using r-project.org <r-sig-mixed-models using r-project.org>
Subject: Re: [R-sig-ME] Bradley Terry GLMM in R ?



On 2022-10-17 4:02 p.m., Shira Mitchell wrote:
> Questions for Ben Bolker about the excellent GLMM FAQ:
>
> Where does the hglm package fit into this very helpful table ?
> https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#what-methods-are-available-to-fit-estimate-glmms

   hglm uses a different *definition of likelihood*, so it's essentially
fitting a different model altogether (although one that will generally
give answers similar to those from other approaches).  Unfortunately,
figuring out the exact meaning and properties of the h-likelihood will
take you down a rabbit hole ... (see refs below)
>
> I wonder also about differences in model formula specifications, since some
> packages (e.g. lme4) don't seem to accommodate Bradley-Terry, whereas some
> packages (e.g. INLA, hglm, MCMCglmm) can accommodate.
> https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#model-specification

    This is less conceptually difficult.  There's a big difference
between the models that the machinery within a package *could* fit and
what its interface will let you specify.  For example, the machinery in
lme4 *can* fit any model with an exponential-family conditional
distribution and multivariate Gaussian random effects (on the link
scale). However, the interface doesn't allow much flexibility in
specifying the X and Z matrices, which is what you would need in order
to fit Bradley-Terry models.  There are extension packages such as
https://github.com/jvparidon/lmerMultiMember (which allows
multi-membership models and hence might be extended to do Bradley-Terry
models), gamm4 (on CRAN; imports machinery from mgcv to set up Z
matrices corresponding to penalized regression terms in the model), etc.
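
As a toy illustration of the kind of Z matrix a Bradley-Terry model needs
(three made-up players A, B, C; each contest contributes one row with +1
for the first competitor and -1 for the second):

contests <- data.frame(player1 = c("A", "A", "B"),
                       player2 = c("B", "C", "C"))
players <- c("A", "B", "C")
# one row per contest, one column per player, entries in {+1, -1, 0}
Z <- t(sapply(seq_len(nrow(contests)), function(i) {
  (players == contests$player1[i]) - (players == contests$player2[i])
}))
colnames(Z) <- players
Z
#      A  B  C
# [1,] 1 -1  0
# [2,] 1  0 -1
# [3,] 0  1 -1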


Meng, Xiao-Li. “Decoding the H-Likelihood.” Statistical Science 24, no.
3 (2009). https://doi.org/10.1214/09-STS277C.

Meng, Xiao-Li. “What’s the H in H-Likelihood: A Holy Grail or an
Achilles’ Heel?” In Bayesian Statistics 9, edited by José M. Bernardo,
M. J. Bayarri, James O. Berger, A. P. Dawid, David Heckerman, Adrian F.
M. Smith, and Mike West. Oxford University Press, 2011.
https://doi.org/10.1093/acprof:oso/9780199694587.003.0016.

Jin, Shaobo, and Youngjo Lee. “A Review of H-Likelihood and Hierarchical
Generalized Linear Model.” WIREs Computational Statistics 13, no. 5
(2021): e1527. https://doi.org/10.1002/wics.1527.


>
>
>
>
> On Mon, Oct 17, 2022 at 3:55 PM Shira Mitchell <shiraqotj using gmail.com> wrote:
>
>> Thanks so much, Jarrod ! Not too late at all. Very interesting to compare
>> MCMC with the approximations (INLA, hglm's extended quasi likelihood). I
>> don't think I have the priors lined up yet across packages. The random
>> effects seem more dispersed according to MCMCglmm than in INLA or hglm, but
>> this could be due to the priors rather than the fitting algorithm. I'll look
>> into the packages' prior defaults.
>>
>> On Mon, Oct 17, 2022 at 4:45 AM Jarrod Hadfield <j.hadfield using ed.ac.uk>
>> wrote:
>>
>>> Hi Shira,
>>>
>>> Perhaps a little late to be useful, but MCMCglmm also fits random-effect
>>> Bradley-Terry models. Just specify ~mm(opponent1-opponent2) in the random
>>> effect formula. The mm stands for multimembership - the BT model is like a
>>> multimembership model where some effects have been multiplied by -1, hence
>>> the ‘-' rather than ‘+’ in the mm model formula.
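>>>
>>> A minimal (untested) sketch of that call, assuming a data frame df with a
>>> binary outcome win1 and factors opponent1 and opponent2; the prior just
>>> fixes the residual variance, as usually recommended for binary data in
>>> MCMCglmm:
>>>
>>> library(MCMCglmm)
>>> prior <- list(R = list(V = 1, fix = 1),                # fixed residual variance
>>>               G = list(G1 = list(V = 1, nu = 0.002)))  # weak prior on the mm() term
>>> fit <- MCMCglmm(win1 ~ 1,
>>>                 random = ~ mm(opponent1 - opponent2),
>>>                 family = "categorical",
>>>                 prior = prior,
>>>                 data = df)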
>>>
>>> Cheers,
>>>
>>> Jarrod
>>>
>>>
>>>> On 16 Oct 2022, at 22:49, Shira Mitchell <shiraqotj using gmail.com> wrote:
>>>>
>>>> Update: Dr Heather Turner <https://www.heatherturner.net/> (author of
>>>> BradleyTerry2) suggested the hglm package, which unlike lme4 allows you
>>> to
>>>> specify generic design matrices (no longer constrained to lme4 formulas
>>> !)
>>>> Results look really similar to INLA so far. Yay !
>>>>
>>>> On Sun, Oct 16, 2022 at 2:19 PM Shira Mitchell <shiraqotj using gmail.com>
>>> wrote:
>>>>
>>>>> Super helpful !  Thank you so much !
>>>>>
>>>>> Out of curiosity, is there a way to fit this type of Bradley-Terry
>>> model
>>>>> in lme4 ? lme4 formulas include random effects via the syntax:
>>>>> https://cran.r-project.org/web/packages/lme4/vignettes/lmer.pdf
>>>>> "(expr | factor). The expression expr is evaluated as a linear model
>>>>> formula, producing a model matrix following the same rules used in
>>> standard
>>>>> R modeling functions (e.g., `lm` or `glm`). The expression factor is
>>>>> evaluated as an `R` factor. One way to think about the vertical bar
>>>>> operator is as a special kind of interaction between the model matrix
>>> and
>>>>> the grouping factor. This interaction ensures that the columns of the
>>> model
>>>>> matrix have different effects for each level of the grouping factor."
>>>>>
>>>>> So (expr | factor) is X_expr * alpha_factor.
>>>>>
>>>>> So naively writing ~ (1 | m_1) + (1 | m_2) gives alpha_{m_1}^{(1)} +
>>>>> alpha_{m_2}^{(2)}, which has twice as many parameters as what we actually
>>>>> want, namely alpha_{m_1} - alpha_{m_2}.
>>>>>
>>>>> But then see this stackexchange:
>>>>>
>>>>>
>>> https://stats.stackexchange.com/questions/483833/opposing-effects-in-lme4-formulae-bradley-terry-model
>>>>> "I could just make a design matrix, where player 1 gets the value 1,
>>> and
>>>>> player 2 gets the value −1. However, unless I'm missing a trick, this
>>> would
>>>>> require having a separate column for each player, and plugging each
>>> player
>>>>> column's name into the formula"
>>>>>
>>>>> But suppose we create columns for all m = 1,...,M messages:
>>>>>
>>>>> A_m = 1 if m = m_1
>>>>>            -1 if m = m_2
>>>>>             0 otherwise
>>>>>
>>>>> I think then ~ (A_1 + ... + A_M  | m_1) is alpha_{m_1}^{(m_1)} -
>>>>> alpha_{m_1}^{(m_2)}, also not what we would want.
>>>>>
>>>>> Back to INLA. Suppose we now want to add random message-specific slopes
>>>>> for variable X_i in addition to random message-specific intercepts:
>>>>>
>>>>> P[i chooses m_1] = logit^-1 (beta_0 + (alpha_{m_1} - alpha_{m_2}) +
>>>>> (beta_{m_1} - beta_{m_2})X_i)
>>>>>
>>>>> alpha_1,...,alpha_M ~ N(0,sigma_intercept)
>>>>> beta_1,...,beta_M ~ N(0,sigma_slope)
>>>>>
>>>>> I see some resources about this, but nothing super comprehensive. Any
>>>>> advice where to look for complete documentation ?
>>>>>
>>>>>
>>> https://groups.google.com/g/r-inla-discussion-group/c/iQELaQF8M9Q/m/q7f4-W8YQksJ
>>>>> (
>>>>>
>>> https://becarioprecario.bitbucket.io/inla-gitbook/ch-multilevel.html#multilevel-models-for-longitudinal-data
>>>>> https://rpubs.com/corey_sparks/431920
>>>>> https://avianecologist.com/2016/10/05/multilevel-models/
>>>>>
>>>>> Here is what we did:
>>>>>
>>>>> data$w = -1          # weight of -1 on the second message's effect
>>>>> data$w_X = -data$X
>>>>> data$m_1_beta = data$m_1
>>>>> data$m_2_beta = data$m_2
>>>>>
>>>>> inla(depvar ~ f(m_1, model = "iid", values = issues) +
>>>>>                 f(m_2, w, copy = "m_1") +
>>>>>                 f(m_1_beta, X, model = "iid", values = issues) +
>>>>>                 f(m_2_beta, w_X, copy = "m_1_beta"),
>>>>>      family = "binomial",
>>>>>      data = data)
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Oct 10, 2022 at 4:24 AM Thierry Onkelinx <
>>> thierry.onkelinx using inbo.be>
>>>>> wrote:
>>>>>
>>>>>> Dear Shira,
>>>>>>
>>>>>> A '-' in a formula means removing that term from the formula. Use a
>>>>>> weight of -1 instead:
>>>>>>
>>>>>> f(home, model = "iid") + f(away, w = -1, copy = "home")
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> ir. Thierry Onkelinx
>>>>>> Statisticus / Statistician
>>>>>>
>>>>>> Vlaamse Overheid / Government of Flanders
>>>>>> INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE
>>>>>> AND FOREST
>>>>>> Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
>>>>>> thierry.onkelinx using inbo.be
>>>>>> Havenlaan 88 bus 73, 1000 Brussel
>>>>>> www.inbo.be
>>>>>>
>>>>>>
>>>>>>
>>> ///////////////////////////////////////////////////////////////////////////////////////////
>>>>>> To call in the statistician after the experiment is done may be no
>>> more
>>>>>> than asking him to perform a post-mortem examination: he may be able
>>> to say
>>>>>> what the experiment died of. ~ Sir Ronald Aylmer Fisher
>>>>>> The plural of anecdote is not data. ~ Roger Brinner
>>>>>> The combination of some data and an aching desire for an answer does
>>> not
>>>>>> ensure that a reasonable answer can be extracted from a given body of
>>> data.
>>>>>> ~ John Tukey
>>>>>>
>>>>>>
>>> ///////////////////////////////////////////////////////////////////////////////////////////
>>>>>>
>>>>>> <https://www.inbo.be>
>>>>>>
>>>>>>
>>>>>> On Fri, 7 Oct 2022 at 23:36, Shira Mitchell <shiraqotj using gmail.com> wrote:
>>>>>>
>>>>>>> Thanks so much, Thierry ! This is great.
>>>>>>>
>>>>>>> This works except that I cannot subtract because:
>>>>>>> f(home, model = "iid") - f(away, copy = "home")
>>>>>>>
>>>>>>> just drops the second term. Apologies that I'm not super familiar
>>> with
>>>>>>> INLA syntax yet.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Oct 7, 2022 at 10:19 AM Thierry Onkelinx <
>>>>>>> thierry.onkelinx using inbo.be> wrote:
>>>>>>>
>>>>>>>> Hi Shira,
>>>>>>>>
>>>>>>>> I fit such models with the INLA package (https://www.r-inla.org/).
>>> The
>>>>>>>> trick is to define two random effects but force their parameter
>>> estimates
>>>>>>>> to be identical.
>>>>>>>>
>>>>>>>> The code would contain something like f(home, model = "iid") +
>>>>>>>> f(away, copy = "home"), meaning home[i] ~ N(0, sigma_home) and
>>>>>>>> away[i] = home[i].
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>
>>>>>>>> ir. Thierry Onkelinx
>>>>>>>> Statisticus / Statistician
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, 7 Oct 2022 at 15:00, Shira Mitchell <shiraqotj using gmail.com> wrote:
>>>>>>>>
>>>>>>>>> We want to fit Bradley-Terry-style GLMM models in R. We looked
>>> into:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>> https://cran.r-project.org/web/packages/BradleyTerry2/vignettes/BradleyTerry.pdf
>>>>>>>>> and
>>>>>>>>> http://bbolker.github.io/mixedmodels-misc/glmmFAQ.html
>>>>>>>>>
>>>>>>>>> We have voter-specific variables x that influence which political
>>>>>>>>> message
>>>>>>>>> (i vs j) wins for them:
>>>>>>>>>
>>>>>>>>> logit[pr(i beats j | person with covariate x)] =
>>>>>>>>>     lambda_i - lambda_j + (beta_i - beta_j) x
>>>>>>>>>
>>>>>>>>> We then model parameters as random effects:
>>>>>>>>> lambda_i ~ N(0, sigma_lambda)
>>>>>>>>> beta_i ~ N(0, sigma_beta)
>>>>>>>>>
>>>>>>>>> Is there a way to do this in R ? We do this in TensorFlow in
>>> Python by
>>>>>>>>> directly specifying design matrices with the 0,-1,1 or 0,-x,x
>>> entries.
>>>>>>>>> However, I do not see how to do this in R using lme4,
>>> BradleyTerry2,
>>>>>>>>> mgcv,
>>>>>>>>> etc.
>>>>>>>>>
>>>>>>>>> Thanks so much,
>>>>>>>>> Shira
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>
>>>
>>>
>>
>

--
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
(Acting) Graduate chair, Mathematics & Statistics
> E-mail is sent at my convenience; I don't expect replies outside of
working hours.
