[R-sig-ME] Bradley Terry GLMM in R ?

Jeroen van Paridon vanparidon at wisc.edu
Mon Dec 12 02:55:38 CET 2022


Hi Shira,

I'll just answer your questions in order:

  1.
For the generalized linear mixed model presented in the lmerMultiMember vignette, both anova() and lmtest::lrtest() perform a Chi-square test, so there's no meaningful difference other than that anova() also returns AIC and BIC statistics. I wasn't intending to discuss AIC and BIC in the vignette, so I just used lrtest() for its simpler results table :-)
I may reconsider that decision, however, because there are other cases where anova() and lrtest() return different results or even different tests, and anova() usually has the more sensible default behavior. Two examples I am aware of:

A) For simple linear models using lm(), it seems that lrtest() performs a Chi-square test, while anova() performs an F-test. (See https://stats.stackexchange.com/questions/535709/anova-vs-likelihood-ratio-test-different-result)
I believe anova() does not perform the F-test for mixed-effects models because estimating degrees of freedom for these models is not straightforward.

B) For linear mixed effects models that return lmerMod (so lmer() rather than glmer()), the default behavior of anova() is to refit REML-fit models with ML before comparing them, whereas lrtest() will just compare the REML-fit models without warning. (As I understand it, comparing REML-fit models is considered acceptable if the models only differ in their random effects, but wrong if the models differ in their fixed effects.)
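To illustrate point B with a minimal sketch (the model and variable names y, x, z, g, and dat are made up, not from the vignette):

library(lme4)
library(lmtest)

## Two REML fits that differ in their *fixed* effects
m       <- lmer(y ~ x + (1 | g), data = dat, REML = TRUE)
m_extra <- lmer(y ~ x + z + (1 | g), data = dat, REML = TRUE)

anova(m_extra, m)   # refits both models with ML before the likelihood ratio test
lrtest(m_extra, m)  # compares the REML log-likelihoods as-is, which is not valid here

## anova(m_extra, m, refit = FALSE) reproduces lrtest()'s behavior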

  2.  The summary output for the group memberships is something I implemented as a quick diagnostic feature for multiple membership models. If you know all your observations are supposed to have e.g. 2 memberships, or 4 memberships on average but never more than 8, this information is easy to verify in the model summary. Strictly speaking I guess these are properties of the indicator matrix rather than the model, but I wanted to put them in a place where people are likely to see them if there is something wrong with their indicator matrix!

I didn't consider Bradley-Terry models when I first implemented the summary, but it makes sense that the summary stats all return 0: The indicators for each observation are supposed to sum to 0 for B-T models, so for these models non-zero min/mean/max values would indicate there is a problem with your indicator matrix.
Adding a "mean count of non-zero values per observation" stat to the summary is on my to-do list; that should make diagnosing problems with B-T indicator matrices easier.
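In the meantime, you can compute those per-observation counts yourself from the indicator matrix. A rough sketch, with W standing in for whatever n_groups x n_obs matrix you passed to memberships =:

library(Matrix)

memberships_per_obs <- colSums(W != 0)  # non-zero indicator entries per observation
summary(memberships_per_obs)            # min / mean / max membership counts

colSums(W)        # for a Bradley-Terry matrix these should all be 0,
                  # while colSums(W != 0) should all be 2 (two teams per match)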

  3.  That is exactly how I would go about predicting unobserved combinations of predictors, for now. One small tip for automatically looking up random effects when you're using matrices for nested grouping factors: interaction_weights() combines grouping levels using a period as separator, so if you've used interaction_weights(color, shape), some of the levels in the indicator matrix would be red.square, green.circle, etc.
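To make the lookup concrete, a rough sketch (the model m, the term name indicatorsXgroup, the column name "x", and x_new are placeholders for illustration, not anything the package creates for you):

re <- ranef(m)   # a list with one data frame per random-effects term

## interaction_weights() joins nested grouping levels with a period, so
## e.g. the color = "red", shape = "square" level is looked up as:
lev      <- paste("red", "square", sep = ".")
lambda_i <- re$indicatorsXgroup[lev, "(Intercept)"]   # random intercept for this level
beta_i   <- re$indicatorsXgroup[lev, "x"]             # random slope for covariate x
plogis(lambda_i + beta_i * x_new)   # inverse logit of lambda_i + beta_i * x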

I hope this helps!
I really appreciate the feedback about usability/vignettes/features/etc.


Cheers,

JP


________________________________
From: Shira Mitchell <shiraqotj using gmail.com>
Sent: Sunday, December 11, 2022 8:45 AM
To: Jeroen van Paridon <vanparidon using wisc.edu>
Cc: JP van Paridon <jvparidon using gmail.com>; r-sig-mixed-models using r-project.org <r-sig-mixed-models using r-project.org>
Subject: Re: [R-sig-ME] Bradley Terry GLMM in R ?

Hi JP !

This is all super helpful. Follow-up questions:

1. What is the difference between
anova(m_extra, m) as used in http://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#testing-significance-of-random-effects
and
lmtest::lrtest(m_extra, m) as used in https://github.com/jvparidon/lmerMultiMember/blob/main/vignettes/lmermultimember_intro.Rmd
?

2. In the summary(m) output, why does "Group memberships per observation for multiple membership REs:" seem to always be all zeros ?
https://jvparidon.github.io/lmerMultiMember/articles/bradleyterry_models.html#sanity-check-on-random-intercepts-team-strength

3. We need to generate predictions for synthetic cases:
pr(message i beats a hypothetical average message | person with covariate x) = logit^-1 ( lambda_i + beta_i x)
If I understand correctly, this isn't yet implemented ? So I wrote a clunky ad hoc version for my model that pulls lambda_i and beta_i from ranef(m) and multiplies by covariate x matrices and takes inverse logit to get the above.

Thanks again,
Shira

On Fri, Dec 9, 2022 at 9:22 PM Jeroen van Paridon <vanparidon using wisc.edu> wrote:
Hi Shira,

I'm glad you're finding it useful!

If I understand correctly, the model you've run already includes continuous covariates (e.g. (x | indicators)), so I think your question is mostly about the categorical groups?

Your intuition to specify them as random effects using the interaction notation (e.g. (1 | indicators:group)) would be correct for lme4, but unfortunately the lmerMultiMember way of specifying these interactions is a little bit messier (there are some backend complexities that prevent me from using the same syntax as lme4). Instead, you'll have to pre-generate the indicator matrices for the interaction groupings using the interaction_weights() function. There's a worked example of how to do this in the lmerMultiMember vignettes (here: https://jvparidon.github.io/lmerMultiMember/articles/lmermultimember_intro.html#using-nestedinteraction-multiple-membership-to-find-the-player-with-the-strongest-year-of-the-2010s), but the basic steps are as follows:

  1.  Generate a Bradley-Terry matrix Wbt for your messages using bradleyterry_from_sparse() or whatever other method you're using.
  2.  Create a sparse matrix Wg for the grouping factor you want to nest by, using Matrix::fac2sparse(group).
  3.  Generate the interaction matrix Wbtxg using interaction_weights(Wbt, Wg) and then use that as your indicator matrix.

I tend to name the interaction matrix dummies the same way I would specify interactions, with an X replacing the colon, so indicators:group would become indicatorsXgroup. This makes it a little easier to keep track of what the different dummies in the model formula mean.

If you have multiple categorical factors that you want to use as random effects groupings (e.g. (1 | indicatorsXgender) + (1 | indicatorsXemployment_status)) you can create separate interaction matrices for those indicator:grouping interactions.
In most cases, it would make sense to also include the indicators for the messages only, so your formula might end up looking something like (1 + age | indicators) + (1 | indicatorsXgender) + (1 | indicatorsXemployment_status), for example.
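Putting the steps together, the workflow might look roughly like this (dat, message1, message2, gender, age, and depvar are made-up names, and I'm building the Bradley-Terry matrix by hand here rather than with a package helper):

library(Matrix)
library(lmerMultiMember)

## 1. Bradley-Terry indicator matrix for the messages: +1 for the first
##    message in each pair, -1 for the second (n_messages x n_obs)
msgs <- sort(unique(c(dat$message1, dat$message2)))
Wbt  <- sparseMatrix(
  i = c(match(dat$message1, msgs), match(dat$message2, msgs)),
  j = rep(seq_len(nrow(dat)), 2),
  x = rep(c(1, -1), each = nrow(dat)),
  dims = c(length(msgs), nrow(dat)),
  dimnames = list(msgs, NULL)
)

## 2. Sparse indicator matrix for the grouping factor
Wg <- fac2sparse(factor(dat$gender))

## 3. Interaction (nested) indicator matrix
Wbtxg <- interaction_weights(Wbt, Wg)

m <- glmer(depvar ~ 1 + (1 + age | indicators) + (1 | indicatorsXgender),
           family = binomial,
           memberships = list(indicators = Wbt, indicatorsXgender = Wbtxg),
           data = dat)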

I hope this explanation makes sense, but do let me know if it doesn't! (I don't talk about Bradley-Terry models very often, so my terminology and notation may be a bit off.)
Two things to be aware of as you're doing this:

  1.  I haven't gotten around to optimizing the interaction_weights() function, so it's a pretty slow implementation. Generating a matrix for a large dataset could take a few minutes! (I'm hopeful I'll get around to fixing that over the holidays...)
  2.  When you make interaction matrices and add covariates, the number of random effects levels tends to explode, and it's very easy for the number of random effects levels that would need to be estimated to exceed the number of observations in your data. That tends to make the model unidentifiable, meaning that lmerMultiMember can't fit it. I've programmed the package so that it throws an error if this happens, just to be safe.

As for the prediction question: The standard lme4 predict() method should work, so for getting predictions on the training data you can just call:
m <- lmerMultiMember::glmer(...)  # fit the model as before
predict(m)                        # predictions for the training data

This returns a vector of predictions in the order of your original data, which you can then use for poststratification, etc.
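For example (the poststrat_weight column is just a placeholder):

pred <- predict(m, type = "response")                 # predicted probabilities on the training data
weighted.mean(pred, w = dat_train$poststrat_weight)   # a simple poststratified average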

Unfortunately there is not yet a good method to generate predictions for synthetic cases (i.e. cases with a combination of predictor levels not observed in your training data). Statistically I think this should be possible, but unfortunately the technical implementation would be quite a bit of work, so I haven't gotten around to it.


I hope this helps. Do let me know if any part of it was unclear!


Cheers,

JP
________________________________
From: Shira Mitchell <shiraqotj using gmail.com>
Sent: Friday, December 9, 2022 8:53 AM
To: JP van Paridon <jvparidon using gmail.com>; Jeroen van Paridon <vanparidon using wisc.edu>
Cc: r-sig-mixed-models using r-project.org <r-sig-mixed-models using r-project.org>
Subject: Re: [R-sig-ME] Bradley Terry GLMM in R ?

Hi Dr van Paridon,

Thank you so much !

We are returning to this after our busy election season. We are using your awesome lmerMultiMember package and have questions.

We have voter-specific variables x that influence which political message (i vs j) wins for them:

logit[pr(i beats j | person with covariate x)] = lambda_i - lambda_j + (beta_i - beta_j) x

We then model parameters as random effects:
lambda_i ~ N(0, sigma_lambda)
beta_i ~ N(0, sigma_beta)

m <- lmerMultiMember::glmer(depvar ~ 1 + (1 | indicators) + (x | indicators),
                            family = binomial,
                            memberships = list(indicators = W),
                            data = dat_train)

This runs beautifully. :)

Now suppose we want the strength of message i among people with covariates x (e.g. a specific age). In reality we have a few covariates, both continuous (x | indicators) and categorical groups (1 | indicators:group).

pr(i beats a hypothetical average message | person with covariate x) = logit^-1 ( lambda_i + beta_i x)

We have a data set that crosses all population x values with all messages, dat_population_all_messages. How do we generate predictions for these synthetic cases ?

Also, if we want to predict specific match-ups from the training data dat_train, how do we do that ?

Thanks again !!

Shira

---------

https://stat.ethz.ch/pipermail/r-sig-mixed-models/2022q4/030224.html

In case it's helpful to anyone following this email thread: I wrote a vignette explaining how to fit a Bradley-Terry model in lme4 using lmerMultiMember. You can find it at https://jvparidon.github.io/lmerMultiMember/articles/bradleyterry_models.html


Cheers,

JP van Paridon (he/him)
Research Associate, Lupyan Lab
University of Wisconsin-Madison
https://github.com/jvparidon
________________________________
From: Jeroen van Paridon <vanparidon using wisc.edu>
Sent: Thursday, October 20, 2022 1:42 AM
To: r-sig-mixed-models using r-project.org <r-sig-mixed-models using r-project.org>
Subject: Re: [R-sig-ME] Bradley Terry GLMM in R ?

Hi,

Just to expand on Ben's last email: In principle, lmerMultiMember allows you to pass arbitrary indicator/weight matrices for the random effects to lme4 for model fitting as long as they have the correct shape. The package contains helper functions for generating more conventional multiple membership matrices since that was my own use-case, but if you create your own matrix with opposed (1 and -1) weights I see no reason why it shouldn't work.

Membership matrices need to be sparse matrices of class Matrix::dgCMatrix and shape n_groups x n_obs. You can probably just take whatever indicator matrix you already have, transpose it, and then cast it to the sparse format.
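For example, something along these lines (X_dense standing in for an existing obs x groups indicator matrix with 0/1/-1 entries):

library(Matrix)

W <- Matrix(t(X_dense), sparse = TRUE)  # transpose to n_groups x n_obs and make it sparse
class(W)                                # typically "dgCMatrix" for a general numeric matrix
dim(W)                                  # n_groups x n_obs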

If you're going this route and run into any issues, feel free to reach out to me directly.


Cheers,

JP van Paridon (he/him)
Research Associate, Lupyan Lab
University of Wisconsin-Madison
https://github.com/jvparidon
________________________________

On Mon, Oct 17, 2022 at 10:02 PM Shira Mitchell <shiraqotj using gmail.com> wrote:
Questions for Ben Bolker about the excellent GLMM FAQ:

Where does the hglm package fit into this very helpful table ?
https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#what-methods-are-available-to-fit-estimate-glmms

I wonder also about differences in model formula specifications, since some packages (e.g. lme4) don't seem to accommodate Bradley-Terry models, whereas others (e.g. INLA, hglm, MCMCglmm) can.
https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#model-specification




On Mon, Oct 17, 2022 at 3:55 PM Shira Mitchell <shiraqotj using gmail.com> wrote:
Thanks so much, Jarrod ! Not too late at all. Very interesting to compare MCMC with the approximations (INLA, hglm's extended quasi-likelihood). I don't think I have the priors lined up yet across packages. The random effects seem more dispersed according to MCMCglmm than in INLA or hglm, but this could be due to the priors rather than the fitting algorithm. Will look into the package prior defaults.

On Mon, Oct 17, 2022 at 4:45 AM Jarrod Hadfield <j.hadfield using ed.ac.uk> wrote:
Hi Shira,

Perhaps a little late to be useful, but MCMCglmm also fits random-effect Bradley-Terry models. Just specify ~mm(opponent1-opponent2) in the random effect formula. The mm stands for multimembership: the BT model is like a multimembership model where some effects have been multiplied by -1, hence the '-' rather than '+' in the mm model formula.
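In code, a call along these lines (the data frame btdat and its columns win, opponent1, opponent2 are placeholders):

library(MCMCglmm)

m_bt <- MCMCglmm(win ~ 1,
                 random = ~ mm(opponent1 - opponent2),  # BT-style multimembership term, per the '-' trick above
                 family = "categorical",                # binary win/loss outcome, logit link
                 data   = btdat)
summary(m_bt)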

Cheers,

Jarrod


> On 16 Oct 2022, at 22:49, Shira Mitchell <shiraqotj using gmail.com> wrote:
>
>
> Update: Dr Heather Turner <https://www.heatherturner.net/> (author of
> BradleyTerry2) suggested the hglm package, which unlike lme4 allows you to
> specify generic design matrices (no longer constrained to lme4 formulas !)
> Results look really similar to INLA so far. Yay !
>
> On Sun, Oct 16, 2022 at 2:19 PM Shira Mitchell <shiraqotj using gmail.com> wrote:
>
>> Super helpful !  Thank you so much !
>>
>> Out of curiosity, is there a way to fit this type of Bradley-Terry model
>> in lme4 ? lme4 formulas include random effect via syntax:
>> https://cran.r-project.org/web/packages/lme4/vignettes/lmer.pdf
>> "(expr | factor). The expression expr is evaluated as a linear model
>> formula, producing a model matrix following the same rules used in standard
>> R modeling functions (e.g., `lm` or `glm`). The expression factor is
>> evaluated as an `R` factor. One way to think about the vertical bar
>> operator is as a special kind of interaction between the model matrix and
>> the grouping factor. This interaction ensures that the columns of the model
>> matrix have different effects for each level of the grouping factor."
>>
>> So (expr | factor) is X_expr * alpha_factor.
>>
>> So naively writing ~ (1 | m_1) + (1 | m_2) is alpha_{m_1}^{(1)} +
>> alpha_{m_2}^{(2)}, twice as many parameters as what we want which is
>> alpha_{m_1} - alpha_{m_2}.
>>
>> But then see this stackexchange:
>>
>> https://stats.stackexchange.com/questions/483833/opposing-effects-in-lme4-formulae-bradley-terry-model
>> "I could just make a design matrix, where player 1 gets the value 1, and
>> player 2 gets the value −1. However, unless I'm missing a trick, this would
>> require having a separate column for each player, and plugging each player
>> column's name into the formula"
>>
>> But suppose we create columns for all m = 1,...,M messages:
>>
>> A_m =  1 if m = m_1
>>       -1 if m = m_2
>>        0 otherwise
>>
>> I think then ~ (A_1 + ... + A_M  | m_1) is alpha_{m_1}^{(m_1)} -
>> alpha_{m_1}^{(m_2)}, also not what we would want.
>>
>> Back to INLA. Suppose we now want to add random message-specific slopes
>> for variable X_i in addition to random message-specific intercepts:
>>
>> P[i chooses m_1] = logit^-1 (beta_0 + (alpha_{m_1} - alpha_{m_2}) +
>> (beta_{m_1} - beta_{m_2})X_i)
>>
>> alpha_1,...,alpha_M ~ N(0,sigma_intercept)
>> beta_1,...,beta_M ~ N(0,sigma_slope)
>>
>> I see some resources about this, but nothing super comprehensive. Any
>> advice where to look for complete documentation ?
>>
>> https://groups.google.com/g/r-inla-discussion-group/c/iQELaQF8M9Q/m/q7f4-W8YQksJ
>> https://becarioprecario.bitbucket.io/inla-gitbook/ch-multilevel.html#multilevel-models-for-longitudinal-data
>> https://rpubs.com/corey_sparks/431920
>> https://avianecologist.com/2016/10/05/multilevel-models/
>>
>> Here is what we did:
>>
>> data$w = -1                 # weight of -1 for the second message, per Thierry's tip above
>> data$w_X = -data$X
>> data$m_1_beta = data$m_1
>> data$m_2_beta = data$m_2
>>
>> inla(depvar ~ f(m_1, model = "iid", values = issues) +
>>               f(m_2, w, copy = "m_1") +
>>               f(m_1_beta, X, model = "iid", values = issues) +
>>               f(m_2_beta, w_X, copy = "m_1_beta"),
>>      family = "binomial",
>>      data = data)
>>
>>
>>
>>
>> On Mon, Oct 10, 2022 at 4:24 AM Thierry Onkelinx <thierry.onkelinx using inbo.be>
>> wrote:
>>
>>> Dear Shira,
>>>
>>> - in a formula object means remove that object from the formula. Use a
>>> weight of -1 instead.
>>>
>>> f(home, model = "iid") + f(away, w = -1, copy = "home")
>>>
>>> Best regards,
>>>
>>> ir. Thierry Onkelinx
>>> Statisticus / Statistician
>>>
>>> Vlaamse Overheid / Government of Flanders
>>> INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE
>>> AND FOREST
>>> Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
>>> thierry.onkelinx using inbo.be
>>> Havenlaan 88 bus 73, 1000 Brussel
>>> www.inbo.be
>>>
>>>
>>> ///////////////////////////////////////////////////////////////////////////////////////////
>>> To call in the statistician after the experiment is done may be no more
>>> than asking him to perform a post-mortem examination: he may be able to say
>>> what the experiment died of. ~ Sir Ronald Aylmer Fisher
>>> The plural of anecdote is not data. ~ Roger Brinner
>>> The combination of some data and an aching desire for an answer does not
>>> ensure that a reasonable answer can be extracted from a given body of data.
>>> ~ John Tukey
>>>
>>> ///////////////////////////////////////////////////////////////////////////////////////////
>>>
>>>
>>>
>>> On Fri, Oct 7, 2022 at 23:36, Shira Mitchell <shiraqotj using gmail.com> wrote:
>>>
>>>> Thanks so much, Thierry ! This is great.
>>>>
>>>> This works except that I cannot subtract because:
>>>> f(home, model = "iid") - f(away, copy = "home")
>>>>
>>>> just drops the second term. Apologies that I'm not super familiar with
>>>> INLA syntax yet.
>>>>
>>>>
>>>>
>>>> On Fri, Oct 7, 2022 at 10:19 AM Thierry Onkelinx <thierry.onkelinx using inbo.be> wrote:
>>>>
>>>>> Hi Shira,
>>>>>
>>>>> I fit such models with the INLA package (https://www.r-inla.org/). The
>>>>> trick is to define two random effects but force their parameter estimates
>>>>> to be identical.
>>>>>
>>>>> The code would contain something like f(home, model = "iid") + f(away,
>>>>> copy = "home"). Meaning home ~ N(0, sigma_beta_i) and home[i] = away[i]
>>>>>
>>>>> Best regards,
>>>>>
>>>>> ir. Thierry Onkelinx
>>>>> Statisticus / Statistician
>>>>>
>>>>> Vlaamse Overheid / Government of Flanders
>>>>> INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE
>>>>> AND FOREST
>>>>> Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
>>>>> thierry.onkelinx using inbo.be
>>>>> Havenlaan 88 bus 73, 1000 Brussel
>>>>> www.inbo.be
>>>>>
>>>>>
>>>>> On Fri, Oct 7, 2022 at 15:00, Shira Mitchell <shiraqotj using gmail.com> wrote:
>>>>>
>>>>>> We want to fit Bradley-Terry-style GLMM models in R. We looked into:
>>>>>>
>>>>>>
>>>>>> https://cran.r-project.org/web/packages/BradleyTerry2/vignettes/BradleyTerry.pdf
>>>>>> and
>>>>>> http://bbolker.github.io/mixedmodels-misc/glmmFAQ.html
>>>>>>
>>>>>> We have voter-specific variables x that influence which political
>>>>>> message
>>>>>> (i vs j) wins for them:
>>>>>>
>>>>>> logit[pr(i beats j | person with covariate x)] = lambda_i - lambda_j +
>>>>>> (beta_i - beta_j) x
>>>>>>
>>>>>> We then model parameters as random effects:
>>>>>> lambda_i ~ N(0, sigma_lambda)
>>>>>> beta_i ~ N(0, sigma_beta)
>>>>>>
>>>>>> Is there a way to do this in R ? We do this in TensorFlow in Python by
>>>>>> directly specifying design matrices with the 0,-1,1 or 0,-x,x entries.
>>>>>> However, I do not see how to do this in R using lme4, BradleyTerry2,
>>>>>> mgcv,
>>>>>> etc.
>>>>>>
>>>>>> Thanks so much,
>>>>>> Shira
>>>>>>
>>>>>>
>>>>>
>

