[R-meta] Meta-analysis approach for physical qualities benchmarks
Tzlil Shushan
tzlil21092 at gmail.com
Wed Jul 3 07:14:41 CEST 2024
Hey James,
Thanks so much for the detailed response.
From reviewing the papers you referred me to, along with several examples
of bivariate approaches in the metafor and mvmeta packages, it appears
that these approaches typically combine estimates of the same effect size
measure (e.g., log odds ratios nested within intervention and control
groups). Considering that my dataset includes means and SDs, which are two
distinct effect size measures, would it be possible to obtain separate
pooled estimates for each measure within a single multivariate multilevel
model?
Indeed, I've been trying a few options (possibly silly ones), such as
stacking the yi and vi from data_means and data_sd into one dataset and
adding measure (i.e., mean or SD) as a moderator in the model; I sketch
this attempt in code below, after the data excerpt. However, this approach
does not seem appropriate once I consider the variance components at the
different levels (i.e., the sigma^2 estimates). I also could not specify a
joint covariance matrix for the two sets of estimates, because the two
measures have different sampling variances.
Considering that the meta-analysis of the SDs is conducted on the very
quantities that drive the sampling variances of the means, might this
justify using separate models? Interestingly, I have seen a few studies
that analysed means and SDs separately; they interpreted the pooled SD as
the between-individual SD, but none of them used the two together to build
benchmarks (for example, z-scores).
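For instance, the benchmark arithmetic I have in mind is the following
(all numbers are made up purely for illustration; they are not pooled
estimates from my data):

# Hypothetical pooled values for the 20m sprint time (illustration only)
pooled_mean <- 3.50        # pooled mean, in seconds
pooled_sd   <- exp(-1.90)  # pooled SD back-transformed from the log scale, ~0.15 s

athlete_time <- 3.30       # an athlete's observed 20m sprint time

z <- (athlete_time - pooled_mean) / pooled_sd  # z-score, about -1.34
pnorm(z)                   # percentile under normality, about 0.09

(Since lower sprint times are better, a time at roughly the 9th percentile
would be faster than about 91% of the benchmark population.)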
To give a clearer picture of how my dataset looks, below is an example
with the first 5 studies for one of the physical performance tests (20m
sprint time). I have not included the other columns that will be used as
moderators in subsequent models.
structure(list(Study.id = c("#4587", "#4587", "#11750", "#5320",
"#5320", "#5320", "#5320", "#10188", "#10188", "#10188", "#10188",
"#10188", "#10188", "#13817"), Group.id = c(2, 2, 3, 4, 5, 6,
7, 7, 8, 9, 10, 11, 12, 18), es.id = 1:14, n = c(16, 16, 23,
11, 11, 9, 6, 11, 13, 15, 10, 14, 12, 18), final.mean = c(3.39,
3.36, 3.52, 3.2, 3.3, 3.15, 3.41, 3.75, 3.68, 3.69, 3.71, 3.68,
3.64, 3.57), final.sd = c(0.21, 0.2, 0.18, 0.12, 0.16, 0.09,
0.17, 0.09, 0.1, 0.08, 0.06, 0.08, 0.1, 0.17), yi_mean = c(3.39,
3.36, 3.52, 3.2, 3.3, 3.15, 3.41, 3.75, 3.68, 3.69, 3.71, 3.68,
3.64, 3.57), vi_mean = c(0.003, 0.003, 0.001, 0.001, 0.002, 0.001,
0.005, 0.001, 0.001, 0, 0, 0, 0.001, 0.002), yi_sd = c(-1.527,
-1.576, -1.692, -2.07, -1.783, -2.345, -1.672, -2.358, -2.261,
-2.49, -2.758, -2.487, -2.257, -1.743), vi_sd = c(0.033, 0.033,
0.023, 0.05, 0.05, 0.062, 0.1, 0.05, 0.042, 0.036, 0.056, 0.038,
0.045, 0.029)), digits = c(est = 4, se = 4, test = 4, pval = 4,
ci = 4, var = 4, sevar = 4, fit = 4, het = 4), row.names = c(NA,
14L), class = c("escalc", "data.frame"))
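To make the question above concrete, here is a rough sketch of the joint
(bivariate) model I was attempting, using the columns from the excerpt
above (assuming the excerpt is assigned to dat). To be clear, this is only
a sketch resting on my own assumptions: the r = .7 value is carried over
from my earlier code, and I have set the sampling covariance between a
mean and its log-SD to zero because, under normality, the sample mean and
sample SD are independent.

library(metafor)

# Stack the means and log-SDs into long format: one row per estimate
dat_long <- rbind(
  data.frame(dat[c("Study.id", "Group.id", "es.id")],
             yi = dat$yi_mean, vi = dat$vi_mean, measure = "mean"),
  data.frame(dat[c("Study.id", "Group.id", "es.id")],
             yi = dat$yi_sd, vi = dat$vi_sd, measure = "logsd")
)

# Block-diagonal V: r = .7 between different estimates of the same measure
# within a group; 0 between a mean and a log-SD (independence under normality)
V <- vcalc(vi = vi, cluster = Group.id, obs = es.id, type = measure,
           rho = c(.7, 0), data = dat_long)

# Bivariate model: one pooled estimate per measure, with correlated
# study-level random effects (the full three-level nesting is omitted here)
res <- rma.mv(yi, V, mods = ~ measure - 1,
              random = ~ measure | Study.id, struct = "UN",
              data = dat_long, method = "REML", test = "t")

# Cluster-robust standard errors, still clustered at the study level
robust(res, cluster = dat_long$Study.id, clubSandwich = TRUE)

Does something along these lines seem sensible, or am I misusing the
machinery?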
Thanks for looking through the entire code and noticing that I used the
study level for computing the covariance matrix. Although my dataset
includes adult female soccer players only, many studies provided data for
separate subgroups. I had not realised I could use groups for creating the
covariance matrix while still computing robust standard errors with
study-level clusters.
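Just to check that I have understood correctly, in code this would look
something like the following (again only a sketch, with r = .7 still an
assumed value):

library(metafor)
library(clubSandwich)

# Covariance matrix imputed at the group level, treating separate
# subgroups within a study as independent samples
V_means <- impute_covariance_matrix(vi = data_means$vi,
                                    cluster = data_means$Group.id,
                                    r = .7,
                                    smooth_vi = TRUE)

rma_means_model <- rma.mv(yi, V_means,
                          random = ~ 1 | Study.id/Group.id/ES.id,
                          data = data_means, method = "REML", test = "t")

# ...but robust standard errors still clustered at the study level
robust_means_model <- robust(rma_means_model,
                             cluster = data_means$Study.id,
                             clubSandwich = TRUE)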
It would be great to hear back from you with any further thoughts.
Best regards,
Tzlil Shushan | Sport Scientist, Physical Preparation Coach
BEd Physical Education and Exercise Science
MSc Exercise Science - High Performance Sports: Strength &
Conditioning, CSCS
PhD Human Performance Science & Sports Analytics
On Wed, 3 Jul 2024 at 4:00, James Pustejovsky <
jepusto using gmail.com> wrote:
> Hi Tzlil,
>
> From my perspective, your approach seems reasonable as a starting point
> for characterizing the distribution of each of these quantities, but I
> would be cautious about trying to create benchmarks based on the results of
> two separate models. It seems like the benchmarks would be a non-linear
> function of both the Ms and the SDs. Evaluating a non-linear function at
> average values of the inputs does not produce the same result as evaluating
> the average of a non-linear function of individual inputs, and it can be
> poor even as an approximation. I would think that it would be preferable to
> work towards a joint model for the Ms and SDs---treating them as two
> dimensions of a bivariate effect size measure. I think this would be
> feasible using multivariate meta-analysis models, for which the metafor
> package provides extensive documentation. See also Gasparrini and
> Armstrong (2011; https://doi.org/10.1002/sim.4226) and Sera et al. (2019;
> https://doi.org/10.1002/sim.8362).
>
> A further reason to consider a joint (multivariate) model is that for many
> distributions other than the Gaussian, mean parameters and variance
> parameters tend to be related. For instance, count data distributions
> typically have variances that grow larger as the mean grows larger. If the
> physical quantities that you are modeling follow such distributions, then
> capturing the interrelationship between the M and SD could be important
> both for purposes of obtaining precise summary estimates and for the
> interpretation of the results.
>
> One other small note about your code: for purposes of creating a sampling
> variance covariance matrix, it makes sense to impute covariances between
> effect size estimates that are based on the same sample (or at least
> partially overlapping samples). I see from your rma.mv code that you have
> random effects for effect sizes nested in groups nested in studies. If the
> groups within a study are independent (e.g., separate samples of male and
> female athletes), then the effect sizes from different groups should
> probably be treated as independent. In this case, your call to
> impute_covariance_matrix() should cluster by Group.id instead of by
> Study.id. But for purposes of computing robust standard errors, you would
> still use cluster = Study.id.
>
> James
>
> On Sun, Jun 30, 2024 at 7:31 PM Tzlil Shushan via R-sig-meta-analysis <
> r-sig-meta-analysis using r-project.org> wrote:
>
>> Dear Wolfgang and R-sig-meta-analysis community,
>>
>> I would like to pick your brains about an approach I am using in my
>> current meta-analysis research.
>>
>> We are conducting a meta-analysis on a range of physical qualities. The
>> primary objective of these meta-analyses is to create benchmarks for
>> previous and future observations.
>>
>> For example, one of the physical qualities includes sprint times from
>> discrete distances (5m to 40m). We have gathered descriptive data (means
>> and standard deviations) from approximately 250 studies.
>>
>> We aim to provide practitioners in the field with tools to compare the
>> results of their athletes to this benchmarking meta-analysis. Therefore,
>> we want to include commonly used tools in our field, such as z-scores
>> and percentiles, to facilitate these comparisons, alongside measures of
>> uncertainty using CIs and PIs.
>>
>> Given that these approaches require the sample/population standard
>> deviations, I have conducted separate multilevel mixed-effects
>> meta-analyses for means and standard deviations.
>>
>> Below is an example of the approach I am considering:
>>
>> ############
>> Meta-analysis of means:
>>
>> data_means <- escalc(measure = "MN",
>>                      mi = Final.Outcome,
>>                      sdi = Final.SD,
>>                      ni = Sample.Size,
>>                      data = data)
>>
>> V_means <- impute_covariance_matrix(vi = data_means$vi,
>>                                     cluster = data_means$Study.id,
>>                                     r = .7,
>>                                     smooth_vi = TRUE)
>>
>> rma_means_model <- rma.mv(yi,
>>                           V_means,
>>                           random = list(~ 1 | Study.id/Group.id/ES.id),
>>                           digits = 2,
>>                           data = data_means,
>>                           method = "REML",
>>                           test = "t",
>>                           control = list(optimizer = "optim",
>>                                          optmethod = "Nelder-Mead"))
>>
>> robust_means_model <- robust(rma_means_model,
>>                              cluster = data_means$Study.id,
>>                              adjust = TRUE,
>>                              clubSandwich = TRUE)
>>
>> est_robust_means_model <- predict(robust_means_model, digits = 2,
>>                                   level = .9)
>>
>>
>> ############
>> Meta-analysis of SDs:
>>
>> data_sd <- escalc(measure = "SDLN",
>>                   sdi = Final.SD,
>>                   ni = Sample.Size,
>>                   data = data)
>>
>> V_sd <- impute_covariance_matrix(vi = data_sd$vi,
>>                                  cluster = data_sd$Study.id,
>>                                  r = .7,
>>                                  smooth_vi = TRUE)
>>
>> rma_sd_model <- rma.mv(yi,
>>                        V_sd,
>>                        random = list(~ 1 | Study.id/Group.id/ES.id),
>>                        digits = 2,
>>                        data = data_sd,
>>                        method = "REML",
>>                        test = "t",
>>                        control = list(optimizer = "optim",
>>                                       optmethod = "Nelder-Mead"))
>>
>> robust_sd_model <- robust(rma_sd_model,
>>                           cluster = data_sd$Study.id,
>>                           adjust = TRUE,
>>                           clubSandwich = TRUE)
>>
>> est_robust_sd_model <- predict(robust_sd_model, digits = 2,
>>                                transf = transf.exp.int, level = .9)
>>
>> I would greatly appreciate your thoughts/feedback on whether this
>> approach is statistically sound. Specifically, is it appropriate to
>> conduct separate meta-analyses for means and SDs and then use the pooled
>> estimates for creating benchmarks? Are there any potential pitfalls or
>> alternative methods you would recommend?
>>
>> Tzlil Shushan | Sport Scientist, Physical Preparation Coach
>>
>> BEd Physical Education and Exercise Science
>> MSc Exercise Science - High Performance Sports: Strength &
>> Conditioning, CSCS
>> PhD Human Performance Science & Sports Analytics
>>
>