[R-meta] Handling meta-analysis dataset with sampling variance equals zero

Fri Jan 14 06:02:39 CET 2022

Hey Michael,

Thanks for your answer and suggestion.

Unfortunately, a binary analysis does not fit our research question: we
are not interested in whether a running speed threshold was reached, but
in how much distance was covered above that speed.

Based on your answer I assume that a reasonable option for us is to report
only tau in the models including many 0s?
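
To make the problem concrete: with the "MN" measure the sampling variance
of a group mean is SD^2/n, so SD = 0 gives a variance of exactly 0 and an
inverse-variance weight of 1/0. Below is a minimal sketch in plain Python
(not the metafor code discussed in this thread) of why Q, and therefore
I^2, cannot be formed for such rows. The function names and the sample
size of 10 per group are hypothetical, chosen only for illustration.

```python
def mean_sampling_variance(sd, n):
    """Sampling variance of a group mean: SD^2 / n."""
    return sd ** 2 / n


def cochran_q(estimates, variances):
    """Cochran's Q with inverse-variance weights; None if any variance is 0."""
    if any(v <= 0 for v in variances):
        return None  # the weight 1/v is undefined, so Q (and I^2) cannot be formed
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))


n = 10  # hypothetical group size, not taken from any study

# Rows patterned on the "meta 3" example: two of the three have mean = SD = 0.
means_high = [0.0, 0.12, 0.0]
variances_high = [mean_sampling_variance(sd, n) for sd in (0.0, 0.02, 0.0)]
q_high = cochran_q(means_high, variances_high)

# Rows patterned on the lower-speed example: every SD is above 0.
means_low = [4.0, 3.0, 6.0]
variances_low = [mean_sampling_variance(sd, n) for sd in (0.8, 0.5, 0.6)]
q_low = cochran_q(means_low, variances_low)
```

Here `q_high` comes back as `None` because two variances are exactly 0,
while `q_low` is an ordinary positive number.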

Kind regards,

Tzlil Shushan | Sport Scientist, Physical Preparation Coach

BEd Physical Education and Exercise Science
MSc Exercise Science - High Performance Sports: Strength &
Conditioning, CSCS
PhD Candidate Human Performance Science & Sports Analytics

On Wed, 12 Jan 2022 at 0:47, Michael Dewey
<lists using dewey.myzen.co.uk> wrote:

> Dear Tzlil
>
> I can understand the desire to make everything consistent across
> outcomes but that does seem to make life unnecessarily complicated for
> you. Would it be acceptable for the third analysis to recode the outcome
> as binary (in this sort of game nobody sprints versus somebody sprints)
> or would that deviate too much from your scientific question?
>
> Michael
>
> On 11/01/2022 04:47, Tzlil Shushan wrote:
> > Hi James,
> >
> > Thank you for the reply.
> >
> > To answer your questions, and to try to make the project easier to
> > understand:
> >
> > We are interested in estimates of the distance covered (group mean and
> > SD, expressed in meters per minute) during the different sided games
> > commonly used in football (soccer) training. We then explore the
> > moderating effect of factors such as game format (e.g. number of
> > players), other configurations (e.g. pitch size) and rules (e.g.
> > scoring options) on these running outcomes.
> >
> > Regarding the speed: yes, different studies usually use different speed
> > zones. For example:
> > Study 1 has 3 buckets (14.4–19.8 km/h, 19.8–24 km/h and everything
> > above 24 km/h) and reports the group's summary statistics for the
> > distance covered in each of the three buckets. Study 2 also has 3
> > buckets (13–18 km/h, 18–22 km/h and everything above 22 km/h) and
> > reports the same summary statistics for each of its three buckets.
> >
> > The first thing we did was to calculate the total running distance from
> > the lower bound of each zone upwards (i.e. everything above 14.4, 19.8
> > and 24 km/h in study 1, and everything above 13, 18 and 22 km/h in
> > study 2). For the means, we simply summed the distances in the relevant
> > buckets. For the SDs, we aggregated using the covariances between
> > buckets, which we either calculated or estimated (by considering their
> > mutual relationship).
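
The aggregation step described above can be sketched as follows. The mean
of a sum of buckets is the sum of the bucket means, and the variance of the
sum is the sum of the bucket variances plus twice each pairwise covariance.
This is a plain-Python illustration, not the code used for the project;
the function name, the bucket values and the correlation of 0.5 are all
hypothetical.

```python
import math
from itertools import combinations


def combine_buckets(means, sds, corr):
    """Mean and SD of the sum of correlated bucket distances.

    corr maps an index pair (i, j), with i < j, to the assumed correlation
    between buckets i and j.
    """
    total_mean = sum(means)
    total_var = sum(sd ** 2 for sd in sds)
    for i, j in combinations(range(len(sds)), 2):
        # cov(i, j) = corr(i, j) * SD_i * SD_j, counted twice in var(sum)
        total_var += 2 * corr[(i, j)] * sds[i] * sds[j]
    return total_mean, math.sqrt(total_var)


# Hypothetical study with three buckets (m/min): 14.4-19.8, 19.8-24, >24 km/h.
means = [6.0, 2.0, 0.5]
sds = [1.2, 0.6, 0.3]
corr = {(0, 1): 0.5, (0, 2): 0.5, (1, 2): 0.5}  # assumed, not estimated

# Distance above 14.4 km/h uses all three buckets.
mean_144, sd_144 = combine_buckets(means, sds, corr)
# Distance above 19.8 km/h uses only the last two buckets.
mean_198, sd_198 = combine_buckets(means[1:], sds[1:], {(0, 1): corr[(1, 2)]})
```

With independent buckets (all correlations 0) the covariance terms vanish
and the combined SD reduces to the square root of the summed variances.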
> >
> > For the final datasets we have 3 different speed zones (informed by
> > conceptual decisions related to our field), and we meta-analyse each of
> > them separately (three independent meta-analyses), let's say:
> > Meta 1 includes all the estimates from >13 km/h up to >16 km/h
> > Meta 2 includes all the estimates from >18 km/h up to >22 km/h
> > Meta 3 includes all the estimates >24 km/h
> >
> > Note: estimates are the group mean and SD of the distance covered (i.e.
> > meters per minute).
> >
> > example dataset meta 1:
> >
> > mean (m/min)   SD (m/min)   Speed (km/h)
> > 4              0.8          >18
> > 3              0.5          >19.8
> > 6              0.6          >22
> >
> >
> > For the meta-analysis that includes the highest speed values (meta 3),
> > many studies report summary statistics of mean = 0 and SD = 0 (see
> > below; i.e. no distance was covered above 24 km/h). These results make
> > sense. For this model we get a warning message about a non-positive
> > definite covariance in the V-matrix, and we cannot obtain heterogeneity
> > statistics such as the Q-statistic and I^2 (as we can for the two
> > lower-speed models). We are keen to know what would be the most
> > reasonable solution for reporting heterogeneity in this model.
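
A small sketch of where the V-matrix warning comes from, assuming the
within-study covariances are imputed from a constant correlation r as
cov_ij = r * sqrt(v_i * v_j). Any effect size with v_i = 0 then produces an
all-zero row and column, so the block cannot be positive definite. This is
plain Python written for illustration, not the clubSandwich code; the
function names, r = 0.6 and the group size of 10 are hypothetical.

```python
import math


def imputed_block(variances, r):
    """Covariance block with v_i on the diagonal and r*sqrt(v_i*v_j) elsewhere."""
    k = len(variances)
    return [
        [variances[i] if i == j else r * math.sqrt(variances[i] * variances[j])
         for j in range(k)]
        for i in range(k)
    ]


def is_positive_definite(matrix):
    """Attempt a Cholesky factorisation; it succeeds only for positive definite input."""
    k = len(matrix)
    lower = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            partial = sum(lower[i][p] * lower[j][p] for p in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= 0:
                    return False  # zero or negative pivot: not positive definite
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return True


r = 0.6  # assumed within-study correlation
n = 10   # hypothetical group size

# Variances SD^2/n for rows patterned on the two example datasets.
block_with_zeros = imputed_block([0.0 ** 2 / n, 0.02 ** 2 / n, 0.0 ** 2 / n], r)
block_without_zeros = imputed_block([0.8 ** 2 / n, 0.5 ** 2 / n, 0.6 ** 2 / n], r)

pd_with_zeros = is_positive_definite(block_with_zeros)
pd_without_zeros = is_positive_definite(block_without_zeros)
```

The block built from the zero-SD rows fails the check at the first pivot,
while the block built from the lower-speed rows passes.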
> >
> > example dataset meta 3:
> >
> > mean (m/min)   SD (m/min)   Speed (km/h)
> > 0              0            >24
> > 0.12           0.02         >24.8
> > 0              0            >25
> >
> > To your last question, game formats differ in many ways both within and
> > between samples, so many studies report multiple effect sizes for the
> > same participants. Therefore, we use a nested approach and RVE for our
> > models while accounting for their covariances. Later on, we conduct
> > meta-regression to test the effect of these differences on running
> > outcomes.
> >
> > I hope this makes the project in general, and our question on
> > heterogeneity in particular, clearer.
> >
> > Kind regards,
> >
> > Tzlil Shushan | Sport Scientist, Physical Preparation Coach
> >
> > On Tue, 11 Jan 2022 at 14:20, James Pustejovsky
> > <jepusto using gmail.com> wrote:
> >
> >> Hi Tzlil,
> >>
> >> I am trying to understand better what your outcomes are and what
> >> questions you're trying to answer. From your explanation, it sounds
> >> like you are interested in the distribution of speeds at which a
> >> player runs during a game. So if you had the raw data for one player,
> >> you might represent it as a histogram showing the amount of distance
> >> traveled during a game (rescaled as distance traveled per minute of
> >> game play) as a function of the speed:
> >>
> >> Speed    Distance traveled (per minute of game play)
> >> ----------- ------------------------------------------------------------
> >> 26 km/h x
> >> 25 km/h xx
> >> 24 km/h x
> >> 23 km/h xxxx
> >> 22 km/h xxxx
> >> 21 km/h xxxxxx
> >> 20 km/h xxx
> >> 19 km/h xxxxxxxx
> >> 18 km/h xxxxxxxxxx
> >> 17 km/h xxxxxxxxxxx
> >> 16 km/h xxxxxxxxxxxx
> >> 15 km/h xxxxxxxxxxxxxxxx
> >> 14 km/h xxxxxxxxxxxxxxx
> >> [etc.]
> >>
> >> But from your explanation, it sounds like you only have summary
> >> statistics on this distribution, such as histograms with much coarser
> >> categories than what I have represented above. Is that correct? And if
> >> so, do different studies generally use the same set of coarse speed
> >> categories? Or does every study use different categories?
> >>
> >> Also, I'm not clear about how you end up with a mean and a SD for each
> >> of these buckets. Is the SD a summary over multiple individual
> >> participants/players? Or over multiple repetitions for an individual?
> >>
> >> All of the above questions are just about the outcomes you and your
> >> colleagues are examining. How would you summarize your research
> >> question? Is it about how variation in game format or game rules
> >> affect the distribution of running speeds? So the "intervention" or
> >> "treatment" of interest is a comparison of different game formats? If
> >> that is correct, do the game formats vary within sample or only
> >> between sample?
> >>
> >> James
> >>
> >>
> >> On Mon, Jan 10, 2022 at 7:32 PM Tzlil Shushan <tzlil21092 using gmail.com>
> >> wrote:
> >>>
> >>> Dear Wolfgang, James and the team,
> >>>
> >>> My colleagues and I are currently conducting a meta-analysis in the
> >>> area of sports. In our analysis, we meta-analyse the exposure to
> >>> high-speed running and sprinting in variations of games during
> >>> football (soccer) training. For example, an outcome may be the
> >>> distance covered (in meters) during a 4 versus 4 players game between
> >>> 14.4 and 19.8 km/h only, or the distance covered above 24.0 km/h.
> >>> Hence, our main outcome is the mean, with a sampling variance
> >>> computed from the SD, expressed in meters per minute of game play (we
> >>> use "MN" in the escalc function).
> >>>
> >>> Because exposure to high velocity thresholds (e.g. >24 km/h) is
> >>> uncommon during such games, we have many outcomes with a mean and SD
> >>> of 0 (so we get a sampling variance of 0), which also yields an
> >>> overall estimate that is very close to zero. In other words, almost
> >>> all of the distance covered during such games is at running speeds
> >>> below what is considered 'sprinting' in football.
> >>>
> >>> My main question is regarding heterogeneity.
> >>>
> >>> When building the 'V-matrix' using the clubSandwich package, we get a
> >>> warning that the matrix is non-positive definite because of the 0s in
> >>> it. We basically ignore this because it makes sense to have 0s, as
> >>> explained. Also, I know that when we run the model we don't get the
> >>> Q-statistic and cannot calculate I^2, for the same reason.
> >>>
> >>> I've been reading some of the past discussions on this in the group,
> >>> but wanted to make sure and ask about reporting approaches. Is there
> >>> any option to get these heterogeneity statistics with our data?
> >>> Alternatively, can we simply state that we don't report them because
> >>> of the nature of the dataset, which includes many 0 values, and
> >>> report tau only?
> >>>
> >>> The main thing is that we also have datasets for lower intensities,
> >>> for which we obtain all of the aforementioned heterogeneity
> >>> statistics, and we want a consistent reporting strategy in the paper,
> >>> unless that is impossible due to the difference in outcomes across
> >>> datasets.
> >>>
> >>> I appreciate your help here.
> >>>
> >>> Kind regards,
> >>>
> >>> Tzlil Shushan | Sport Scientist, Physical Preparation Coach
> >>>
> >>
> >
> > _______________________________________________
> > R-sig-meta-analysis mailing list
> > R-sig-meta-analysis using r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
> >
>
> --
> Michael
> http://www.dewey.myzen.co.uk/home.html
>
