# [R-meta] Construct the covariance-matrices of different effect sizes

Tzlil Shushan tz|||21092 @end|ng |rom gm@||@com
Wed Jan 20 05:02:02 CET 2021

```Hi James,

Thank you so much for the response and further explanations around the
merits associated with V-matrices in correlated datasets. I think that both
you and Wolfgang have already provided a thorough insight on this in
previous threads in the mailing list and papers as this recent one
https://osf.io/x8yre/
However, I struggle to be confident with the formulas I use for each effect
index, and since my math background isn't wide, I found it hard to extract
them from papers and I'll be super grateful to get your help here :)

Since I don't have enough information on the correlation between outcomes,
I'm using 'guesstimate' values and back it up with sensitivity analysis (at
first glance, with my current formulas for V-matrices, values range from 0
to 0.9 correlation did not yield changes in the results).

(1) I might start with my Pearson correlation and ICC, because they are
probably the tricky ones, and up to now I don't really have something to
support on besides impute_covariance_matrix formula. For both datasets I
used r to Fisher's z transformation to approximate normal distributed
datasets.

ICC is simply the ICC between test-retest (submaximal heart rate values).
ICCs from the same sample can be derived from the number of test-retests
(i.e. more than one paired tests), different intensities, different test
duration etc., which I investigate  separately as meta-regression. However,
I'm first interested in the average effect.

Pearson correlations are derived from the association between heart rate at
submaximal intensity(ies) and criterion measure of aerobic fitness (let's
say max test). In theory, we expect an inverse relationship–lower heart
rate at submaximal intensity associated with higher result in max test.
Similarly to ICC, these effects are obtained from different (submaximal)
intensities, time-points across the season etc. I want to extract the
average effect, and then continue to moderator analyses.

## these are datasets examples

es.id study.id  n  icc     yi     vi
1     1         25 0.90 1.4722 0.0455
2     1         25 0.84 1.2212 0.0455
3     2         16 0.72 0.9076 0.0769
4     2         16 0.85 1.2562 0.0769
5     2         16 0.83 1.1881 0.0769
6     3         38 0.92 1.5890 0.0286

es.id study.id  n     r      yi           vi
1           1       25 -0.61 -0.7089 0.0455
2           1       25 -0.58 -0.6625 0.0455
3           2       16 -0.58 -0.6625 0.0769
4           2       16 -0.60 -0.6931 0.0769
5           2       16 -0.56 -0.6328 0.0769
6           3       38 -0.55 -0.6184 0.0286

I tried to implement rcalc() with my dataset, but the thing is that I have
two variables that are constant (submax and submax tests for ICC and submax
and max tests for *r*) that were examined repeatedly within samples, while
the formula requires different paired variables within sample/studies as
the example here https://wviechtb.github.io/metafor/reference/rcalc.html Is
it right? I'm quite lost here!

(2) As you mentioned, I construct the correlation between raw mean
difference using impute_covariance_matrix(). However, as I understand, it
is more useful to use with standardised mean difference effects? The same
is true for raw mean difference?

(3) Yes, my effect size index for SDs is the the log transformed SD (plus
bias correction for sample size) using the method proposed by Nakagawa et.al
2015
https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/2041-210X.12309
("SDLN"
in escal function) whereby the sampling variance is constructed solely
based on sample size 1/2(n-1). To construct the V-matrix I used Wolfgang
response in the mailing list regarding a similar question (see below)
https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2019-July/001630.html
Is this the right (or a fair) way to approach this? Does the Delta method
work better here? If so, can you refer me to something that I can work with.

## this is a small example of my dataset

es.id study.id  n   SD     yi     vi
1         1         25 1.45 0.3924 0.0208
2         1         25 1.41 0.3644 0.0208
3         2         16 1.05 0.0821 0.0333
4         2         16 1.20 0.2157 0.0333
5         2         16 1.27 0.2724 0.0333
6         3          38 1.24 0.2286 0.0135

I'll be grateful to get any insight from you and others on the mailing
list..

Thanks and Kind regards,

Tzlil Shushan | Sport Scientist, Physical Preparation Coach

BEd Physical Education and Exercise Science
MSc Exercise Science - High Performance Sports: Strength &
Conditioning, CSCS
PhD Candidate Human Performance Science & Sports Analytics

‫בתאריך יום ד׳, 20 בינו׳ 2021 ב-1:53 מאת ‪James Pustejovsky‬‏ <‪
jepusto using gmail.com‬‏>:‬

> Hi Tzlil,
>
> Apologies for the long delay in responding to your query. You've raised
> some excellent questions about rather subtle issues. To your question (1),
> I would say that it is NOT compulsory to use a sampling variance-covariance
> matrix (the "V-matrix") for the effect sizes of each type. Omitting the
> V-matrix amounts to assuming a correlation of zero. If your goal is
> primarily to understand average effect sizes (of each type), then using
> robust standard errors/hypothesis tests/confidence intervals will work even
> if you have a mis-specified assumption about the correlation among effect
> sizes.
>
> That said, there are at least two potential benefits to using a V-matrix
> based on more plausible assumptions about the correlation between effect
> sizes. First, using a working model that is closer to the true dependence
> structure will yield a more precise estimate of the average effect. If
> you're just estimating an average effect of each type, the gain in
> precision will probably be pretty small. If you're estimating a
> meta-regression with a more complex set of predictors, the gains can be
> more substantial.
>
> The second potential benefit is that using a plausible V-matrix will give
> you better, more defensible estimates of the variance components
> (between-study variance and within-study variance). Whether based on REML
> or some other estimation method, the variance component estimates are NOT
> robust to mistaken assumptions about the sampling correlation structure.
> They'll be biased unless you have the sampling correlation structure
> approximately correct. So to the extent that understanding heterogeneity is
> important, I think it's worth working on building in a V-matrix.
>
> To your question (2), I like the approach you've outlined, where you use
> different V-matrices for each of the effect indices you're looking at. I
> think ideally, you would start by making a single assumption about the
> degree of correlation between the *outcomes*, and then using that to
> derive the appropriate degree of correlation between each of the indices:
> * For raw mean differences, the correlation between outcomes will
> translate directly into the correlation between mean differences.
> * For SDs, I'm not sure exactly what your ES index is. Is it the log ratio
> of SDs? How did you arrive at the formula for the correlation between
> effect sizes? I don't know of a source for this, but it could be derived
> via the delta method.
> * For ICCs and pearson correlations, using Wolfgang's function would be
> the way to go. Perhaps if you can provide a small example of your data and
> syntax that you've attempted, folks on the list can provide guidance about
> applying the function.
>
> Kind Regards,
> James
>
> On Thu, Jan 7, 2021 at 5:16 PM Tzlil Shushan <tzlil21092 using gmail.com> wrote:
>
>> Dear Wolfgang and James,
>>
>> Apologise for the long assay in advance..
>>
>> In my meta-analysis I obtained different effect sizes coming
>> from test-retest and correlational designs. Accordingly, I performed 4
>> different meta-analyses for each effect size:
>>
>> Raw mean difference of test-retest
>> Standard deviation (using Nakagawa et al. 2015 approach) of test-retest
>> Intraclass correlation (transformed to z fisher values) of test-retest
>> Pearson correlation coefficient (transformed to z fisher values) derived
>> from the same test against criterion measure.
>>
>> Because many studies meeting inclusion criteria provided more than one
>> effect size through various ways of repeated measures (for example,
>> multiple intensities of the test, repeated measures across the year), which
>> all based on a common sample of participants, I treated each unique sample
>> as an independent study (NOTE: this approach serves our purposes on the
>> best way and adding further level will results in low number of
>> clusters–which I don't want, given the use of RVE).
>>
>> Thanks to the great discussions in this group, we've done the following:
>> (1) used rma.mv() to estimate the overall average estimate and the
>> variance in hierarchical working model. The same for meta-regressions we
>> performed.
>>
>> (2) Compute robust variance estimates with robust() and coef_test()
>> functions, clustering at the level of studies (the same is true for both
>> overall models and meta-regression).
>>
>> However, after reading some threads in the groups in the last weeks
>>
>> https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2021-January/002565.html
>>
>> https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2018-February/000647.html
>> and more...I think that one step further is to
>> provide variance-covariance matrices for each meta-analysis before step 1
>> and 2 noted above.
>>
>> In this regard I have some other questions:
>>
>> (1) Is it compulsory to create (an estimate) variance-covariance given
>> the structure of my dataset?
>>
>> (2) IF YES, I'm not sure if I can use the same covariance formulas for
>> all effect sizes. For example, impute_covariance_matrix() from clubSandwich
>> can work fine with all effect sizes (mean diff, SD, icc etc.)? or should I
>> estimate the covariance-matrix with a unique function for each effect size?
>>
>> Based on reading and suggestions:
>> • I used impute_covariance_matrix() for mean difference.
>>
>> • For standard deviation I constructed the formula below:
>> calc.v <- function(x) {
>> v <- matrix(r^2/(2*x\$ni[1]-1), nrow=nrow(x), ncol=nrow(x))
>> diag(v) <- x\$vi
>> v
>> }
>> V <- bldiag(lapply(split(dat, dat\$study), calc.v))
>> http://www.metafor-project.org/doku.php/analyses:gleser2009
>>
>> • for icc and pearson correlation I've looked at this
>> https://wviechtb.github.io/metafor/reference/rcalc.html but I couldn't
>> create something which is appropriate to my dataset (I don't really know
>> how to specify var1 and var2).
>>
>> With this regard, I created a sensitivity analysis (with 0.3, 0.5, 0.7
>> and 0.9) which revealed similar overall estimates (also similar to the
>> working models without covariance-matrix), albeit, changed a bit the
>> magnitude of sigma2.1 and sigma2.2
>>
>> I'll be thankful to get any thoughts..
>>
>> Kind regards and thanks in advance!
>>
>> Tzlil
>>
>> Tzlil Shushan | Sport Scientist, Physical Preparation Coach
>>
>> BEd Physical Education and Exercise Science
>> MSc Exercise Science - High Performance Sports: Strength &
>> Conditioning, CSCS
>> PhD Candidate Human Performance Science & Sports Analytics
>>
>>

[[alternative HTML version deleted]]

```