[R-sig-ME] Modelling football matches
Ben Bolker
bbolker using gmail.com
Sat Dec 17 01:26:13 CET 2022
On 2022-12-15 3:02 p.m., Jorge Teixeira wrote:
> Thank you, Ben.
>
> Yes, indeed there are many more things that could be added - I was
> trying to discuss a more fundamental structure.
>
> Game_part was related to the fact that each game has part 1 and part 2.
>
> 1) I agree it makes sense to have game_part as a fixed effect too, which
> gives this model:
>
> lmer(distance ~ stage + game_part + (1|player) + (1|game/game_part),
> data=my_data)
>
> 2) As for the random slopes, my question is whether the variation by
> game and game_part might differ across players. Can random slopes
> account for that?
That's a little challenging with 'typical' mixed model machinery.
Models where both the mean (location) and variance (scale) vary
according to covariates or groups are called 'location-scale' models.
There is a category in the mixed models task view
<https://cran.r-project.org/web/views/MixedModels.html> that covers
this, but I'm not sure whether the scale is allowed to vary as a *random
effect* -- it certainly isn't in glmmTMB.
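
A minimal sketch of a location-scale fit, for illustration only. It assumes glmmTMB is installed and uses simulated data in place of my_data; the data frame `sim`, the effect sizes, and the choice of game_part as the scale covariate are all made up. The scale part here is a *fixed* effect (via `dispformula`), not the per-player random scale effect asked about.

```r
# Simulated stand-in for my_data; every number below is invented.
set.seed(1)
sim <- expand.grid(player = factor(1:20), game = factor(1:8),
                   game_part = factor(1:2))
# Games 1-4 are group stage, games 5-8 final stage (one stage per game).
sim$stage <- factor(ifelse(as.integer(sim$game) <= 4, "group", "final"),
                    levels = c("group", "final"))

# Location part: stage effect plus player, game, and part-within-game effects.
player_eff <- rnorm(nlevels(sim$player), sd = 0.3)
game_eff   <- rnorm(nlevels(sim$game), sd = 0.2)
gp         <- interaction(sim$game, sim$game_part, drop = TRUE)
gp_eff     <- rnorm(nlevels(gp), sd = 0.15)

# Scale part: residual SD is larger in the second half of each game.
resid_sd <- ifelse(sim$game_part == "2", 0.6, 0.3)

sim$distance <- 5 + 0.2 * (sim$stage == "final") +
  player_eff[as.integer(sim$player)] +
  game_eff[as.integer(sim$game)] +
  gp_eff[as.integer(gp)] +
  rnorm(nrow(sim), sd = resid_sd)

# Fit only if glmmTMB is available (assumption: it is installed separately).
if (requireNamespace("glmmTMB", quietly = TRUE)) {
  fit_ls <- glmmTMB::glmmTMB(
    distance ~ stage + game_part + (1 | player) + (1 | game/game_part),
    dispformula = ~ game_part,  # residual variability depends on game_part
    data = sim
  )
  print(summary(fit_ls))
}
```

With so few games the game-level variance estimates may be poorly determined, so convergence warnings would not be surprising in a toy example like this.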
>
> 3) For outcomes such as relative average heart rate, that are bounded by
> 100%, do you recommend a specific family of models?
Provided the values never reach exactly 0 or 100%, a beta model (fitted
on the proportion scale) is the natural choice.
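
A hedged sketch of what a beta mixed model could look like, again assuming glmmTMB is installed and using simulated data. The data frame `hr`, the column names `rel_hr_pct` and `rel_hr`, and all parameter values are hypothetical; the random-effects structure is only one plausible choice.

```r
# Simulated relative heart rate (percent of maximum); all values invented.
set.seed(2)
hr <- expand.grid(player = factor(1:20), game = factor(1:8))
hr$stage <- factor(ifelse(as.integer(hr$game) <= 4, "group", "final"),
                   levels = c("group", "final"))

player_hr_eff <- rnorm(nlevels(hr$player), sd = 0.4)
# Mean on the logit scale, then mapped to (0, 1).
mu  <- plogis(1.2 + 0.2 * (hr$stage == "final") +
              player_hr_eff[as.integer(hr$player)])
phi <- 30  # beta precision parameter (hypothetical value)
hr$rel_hr_pct <- 100 * rbeta(nrow(hr), mu * phi, (1 - mu) * phi)

# The beta family needs proportions strictly inside (0, 1).
hr$rel_hr <- hr$rel_hr_pct / 100
stopifnot(all(hr$rel_hr > 0 & hr$rel_hr < 1))

# Fit only if glmmTMB is available (assumption: it is installed separately).
if (requireNamespace("glmmTMB", quietly = TRUE)) {
  fit_beta <- glmmTMB::glmmTMB(
    rel_hr ~ stage + (1 | player) + (1 | game),
    family = glmmTMB::beta_family(link = "logit"),
    data = hr
  )
  print(summary(fit_beta))
}
```

The explicit range check is there because a beta likelihood is undefined at exactly 0 or 1, which is the caveat stated above.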
>
> Thanks once again.
>
> Date: Thu, 15 Dec 2022 12:12:16 -0500
> From: Ben Bolker <bbolker using gmail.com>
> To: r-sig-mixed-models using r-project.org
> Subject: Re: [R-sig-ME] Modelling football matches
> Message-ID: <ca745d94-ac01-3422-cd66-0d85058d8936 using gmail.com>
> Content-Type: text/plain; charset="utf-8"; Format="flowed"
>
>
> For a positive-valued variable like distance you might want to
> consider a log-linear model (lmer(log(distance) ~ ...)) or a Gamma GLMM
> (glmer(distance ~ ..., family = Gamma(link="log")))
>
> I believe the full model here would use random slopes ('slopes' in
> the broad sense since stage is a categorical variable) of stage
> (stage|player) - (stage|game) won't work because each game is only one
> stage.
>
> I'm not sure about the definition of 'game_part', but you might want
> to add a *fixed* effect of game_part as well as the 'game_part within
> game' nested random effect.
>
> There's probably a huge amount of covariate information you could add
> (e.g. player's position, player's age), probably other stuff too (random
> effect of team?)
>
> Jorge Teixeira <jorgemmtteixeira using gmail.com> wrote on Thursday,
> 15/12/2022 at 15:17:
>
> Hi.
>
> 1) Assuming that most are somewhat familiar with football, and that
> it is world cup time, what do you think of this model to compare
> differences in distance covered between stages (group stage vs final
> stage)?
>
> lmer(distance ~ stage + (1|player) + (1|game/game_part), data=my_data)
>
> 2) In theory, which random slopes do you think should be added, if any?
>
> Thank you.
>
--
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
(Acting) Graduate chair, Mathematics & Statistics
> E-mail is sent at my convenience; I don't expect replies outside of
working hours.