# [R-meta] Handling meta-analysis dataset with sampling variance equals zero

Tzlil Shushan
Tue Jan 11 05:47:42 CET 2022

```
Hi James,

To answer your questions, and to try to make the project more understandable:

We are interested in estimates of the distance covered (group mean and SD,
expressed as meters/min) during the different sided games commonly used in
football (soccer) training. We then explore the effect of moderating factors
such as game format (e.g. number of players), other configurations
(e.g. pitch size) or rules (e.g. scoring options) on these running outcomes.

Regarding the speed, YES, different studies usually use different speed
zones; for example:
Study 1 has 3 buckets (14.4–19.8 km/h, 19.8–24 km/h and everything >24) and
reports the group's summary statistics of distance covered in each of the
three buckets. Study 2 also has 3 buckets (13–18 km/h, 18–22 km/h and
everything >22) and reports the group's summary statistics of distance
covered in each of the three buckets.

The first thing we did was to calculate the overall running from each zone's
lower threshold upwards (i.e. everything >14.4, >19.8 and >24 in study 1, and
everything >13, >18 and >22 in study 2). Of note, for means we simply
added the distances in the relevant buckets. For aggregating SDs we either
calculated or estimated the covariances between buckets (by considering
their mutual relationship).

For the final datasets we basically have 3 different speed zones (informed
by conceptual decisions related to our field), each of which we meta-analyse
separately (three independent meta-analyses), let's say:
Meta 1 includes all the estimates >13 km/h and up to >16 km/h
Meta 2 includes all the estimates >18 km/h and up to >22 km/h
Meta 3 includes all the estimates >24 km/h

Note: estimates are the group mean and SD of the distance covered (i.e.
meters per min).

Example dataset, meta 2:

mean (m/min)   SD (m/min)   Speed (km/h)
4              0.8          >18
3              0.5          >19.8
6              0.6          >22

For the meta-analysis that includes the highest speed values (meta 3), there
are many studies reporting summary statistics of mean=0 and SD=0 (see below;
i.e. none of the distance covered was above 24 km/h). These results make
sense. For this model we get a warning message that the V-matrix is not
positive definite, and we can't obtain heterogeneity statistics such as the
Q-statistic and I^2 (as we do for the two lower-speed models). We are keen
to know what would be the most reasonable solution for reporting
heterogeneity in this model.

Example dataset, meta 3:

mean (m/min)   SD (m/min)   Speed (km/h)
0              0            >24
0.12           0.02         >24.8
0              0            >25

To your last question, there are many differences in game formats both
within and between samples, resulting in many studies reporting multiple
effect sizes for the same participants. Therefore, we use a nested approach
and RVE for our models while accounting for their covariances. Later on, we
conduct meta-regression to test the effect of these differences on running
outcomes.

I hope this makes the project in general, and our question on
heterogeneity in particular, clearer.

Kind regards,

Tzlil Shushan | Sport Scientist, Physical Preparation Coach

BEd Physical Education and Exercise Science
MSc Exercise Science - High Performance Sports: Strength & Conditioning, CSCS
PhD Candidate Human Performance Science & Sports Analytics

On Tue, Jan 11, 2022 at 14:20, James Pustejovsky <jepusto using gmail.com>
wrote:

> Hi Tzlil,
>
> I am trying to understand better what your outcomes are and what
> like you are interested in the distribution of speeds at which a
> player runs during a game. So if you had the raw data for one player,
> you might represent it as a histogram showing the amount of distance
> traveled during a game (rescaled as distance traveled per minute of
> game play) as a function of the speed:
>
> Speed    Distance traveled (per minute of game play)
> ----------- ------------------------------------------------------------
> 26 km/h x
> 25 km/h xx
> 24 km/h x
> 23 km/h xxxx
> 22 km/h xxxx
> 21 km/h xxxxxx
> 20 km/h xxx
> 19 km/h xxxxxxxx
> 18 km/h xxxxxxxxxx
> 17 km/h xxxxxxxxxxx
> 16 km/h xxxxxxxxxxxx
> 15 km/h xxxxxxxxxxxxxxxx
> 14 km/h xxxxxxxxxxxxxxx
> [etc.]
>
> But from your explanation, it sounds like you only have summary
> statistics on this distribution, such as histograms with much coarser
> categories than what I have represented above. Is that correct? And if
> so, do different studies generally use the same set of coarse speed
> categories? Or does every study use different categories?
>
> Also, I'm not clear about how you end up with a mean and a SD for each
> of these buckets. Is the SD a summary over multiple individual
> participants/players? Or over multiple repetitions for an individual?
>
> All of the above questions are just about the outcomes you and your
> colleagues are examining. How would you summarize your research
> question? Is it about how variation in game format or game rules
> affect the distribution of running speeds? So the "intervention" or
> "treatment" of interest is a comparison of different game formats? If
> that is correct, do the game formats vary within sample or only
> between sample?
>
> James
>
>
> On Mon, Jan 10, 2022 at 7:32 PM Tzlil Shushan <tzlil21092 using gmail.com>
> wrote:
> >
> > Dear Wolfgang, James and the team,
> >
> > My colleagues and I are currently conducting a meta-analysis in the area
> > of sports. In our analysis, we meta-analyse the exposure to high-speed
> > running and sprinting in variations of games during football (soccer)
> > training. For example, an outcome may be the distance covered (in meters)
> > during a 4 versus 4 players game between 14.4 and 19.8 km/h only, or the
> > distance covered above 24.0 km/h. Hence, our main outcome is the mean,
> > with its sampling variance derived from the SD, expressed as meters per
> > minute of game play (we use "MN" in the escalc function).
> >
> > Considering that exposure to high velocity thresholds (e.g. >24 km/h)
> > rarely happens during such games, we have many outcomes with a mean and
> > SD of 0 (so we get a sampling variance of 0), which also yields an
> > overall estimate that is very close to zero. In other words, almost all
> > of the distance covered during such games is at running speeds below
> > what is considered 'sprinting' in football.
> >
> > My main question is regarding heterogeneity.
> >
> > Whilst building the 'V-matrix' using the clubSandwich package we get a
> > warning message that the matrix is not positive definite, due to the 0s
> > in it. We basically ignore this because it makes sense to have 0s, as
> > explained. Also, I know that when we run the model we don't get the
> > Q-statistic and cannot calculate I^2, for the same reason.
> >
> > I've been reading some of the past discussions on this in the group;
> > however, I wanted to make sure and ask about reporting approaches. Is
> > there any option to get these heterogeneity statistics with our data?
> > Alternatively, can we simply state that we don't report these because of
> > the nature of the dataset, which includes many 0 values, and report tau
> > only?
> >
> > The main thing is that we have datasets for the lower intensities, for
> > which we obtain all the aforementioned heterogeneity statistics, and we
> > want a consistent reporting strategy in the paper, unless that is
> > impossible due to the difference in outcomes across datasets.
> >
> > I appreciate your help here.
> >
> > Kind regards,
> >
> > Tzlil Shushan | Sport Scientist, Physical Preparation Coach
> >
> > BEd Physical Education and Exercise Science
> > MSc Exercise Science - High Performance Sports: Strength & Conditioning, CSCS
> > PhD Candidate Human Performance Science & Sports Analytics
> >
>

```
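
The arithmetic described in the message above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' actual code. The bucket values, the group size `n`, and the common between-bucket correlation `r` are all hypothetical. The helper names `sd_of_sum` and `mean_sampling_variance` are invented for this sketch. The sketch assumes that the sampling variance of a raw group mean is SD^2 / n, which is, to my understanding, what the "MN" measure corresponds to.

```python
import math


def sd_of_sum(sds, r):
    """SD of a sum of variables, assuming one common pairwise correlation r.

    Uses Var(sum) = sum of variances + 2 * sum of pairwise covariances,
    where each covariance is r * sd_i * sd_j (r is an assumption here).
    """
    var = sum(s ** 2 for s in sds)
    for i in range(len(sds)):
        for j in range(i + 1, len(sds)):
            var += 2 * r * sds[i] * sds[j]
    return math.sqrt(var)


def mean_sampling_variance(sd, n):
    """Sampling variance of a raw group mean: sd^2 / n."""
    return sd ** 2 / n


# Hypothetical bucket summaries (m/min) for one study with three speed buckets.
bucket_means = [3.0, 1.0, 0.2]
bucket_sds = [0.6, 0.3, 0.1]
r = 0.5   # assumed correlation between buckets (hypothetical)
n = 20    # assumed number of players (hypothetical)

# "Everything above the lowest threshold": add the bucket means,
# and combine the SDs while accounting for their covariances.
mean_above_lowest = sum(bucket_means)
sd_above_lowest = sd_of_sum(bucket_sds, r)
vi_above_lowest = mean_sampling_variance(sd_above_lowest, n)

# A group that never exceeded the top threshold: mean = 0 and SD = 0,
# so the sampling variance is exactly zero.
vi_zero = mean_sampling_variance(0.0, n)

print(mean_above_lowest, sd_above_lowest, vi_above_lowest, vi_zero)
```

A zero sampling variance puts a zero on the diagonal of the V-matrix, which is why the matrix cannot be positive definite, and why statistics that weight by the inverse of the sampling variances (such as Q) are undefined for those rows.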