[R-meta] Meta-analytical test of mediation model including dependent tests - looking to resolve metafor issue or find alternative approach

Thu Dec 17 10:33:48 CET 2020

Dear Lukas,

The problem with making modeling decisions based on the data is that it changes the statistical properties of estimators and corresponding sampling errors in unknown ways, so ideally we should avoid these kinds of data-dependent decisions. At the same time, not modeling the heteroscedasticity when the data suggest it is present also doesn't seem like a good strategy.

As for the alternative approach: Let's say you are interested in the correlation between constructs A and B, but a particular study measures construct A in two different ways, with measures A1 and A2. Then there are three correlations in total, namely r(A1,A2), r(A1,B), and r(A2,B). With rcalc(), you can construct the corresponding 3x3 var-cov matrix of these 3 correlations (V). For the actual meta-analysis, the study would then be coded as:

study cor      mod      r
-------------------------
1     r_A1_A2  r_other  .
1     r_A1_B   r_AB     .
1     r_A2_B   r_AB     .

Actually, variable 'cor' is irrelevant here, since 'mod' is the moderator of interest. By coding the second and third rows as r_AB, the model will automatically pool the two correlations (while taking the dependency between the three correlations into account via the var-cov matrix V).
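
A minimal sketch of this setup in R, assuming a metafor version that includes rcalc(). All numbers (r(A1,A2) = 0.60, the two A-B correlations, n = 200) are made up for illustration, and the column names 'var1', 'var2', 'ri', and 'ni' are simply my choice here, following the long-format layout used on the dat.craft2003 help page:

```r
library(metafor)

# One hypothetical study with three correlations among A1, A2, and B
dat <- data.frame(
  study = 1,
  var1  = c("A1", "A1", "A2"),
  var2  = c("A2", "B",  "B"),
  ri    = c(0.60, 0.30, 0.25),   # made-up values
  ni    = 200                    # made-up sample size
)

tmp <- rcalc(ri ~ var1 + var2 | study, ni = ni, data = dat)
V   <- tmp$V     # 3x3 var-cov matrix of the three correlations
est <- tmp$dat   # one row per correlation, estimates in column 'yi'

# 'mod' as in the table: both A-B correlations get the same level
est$mod <- ifelse(est$var1 == "B" | est$var2 == "B", "r_AB", "r_other")

# Model with known V and no random effects; one coefficient per level of 'mod'
res <- rma.mv(yi, V, mods = ~ mod - 1, data = est)
print(res)
```

The rows are matched by variable name rather than by position, so the sketch does not rely on the order in which rcalc() returns the correlations.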

Moreover, one could even consider leaving out row 1, since you are interested in the A-B correlation. So, we are then left with:

study cor      mod      r
-------------------------
1     r_A1_B   r_AB     .
1     r_A2_B   r_AB     .

and from V we just keep rows/columns 2-3. However, you need to start with all three correlations, because cov(r_A1_B, r_A2_B) depends on r_A1_A2.
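
To illustrate why r_A1_A2 is needed, here is a sketch of the standard large-sample formula for the covariance between two correlations that share one variable (see Olkin & Siotani, 1976; Steiger, 1980). The function name and the input values are hypothetical, and dividing by n - 1 rather than n is an assumption of this sketch, not necessarily what rcalc() does internally:

```r
# Large-sample covariance between r_A1_B and r_A2_B (both involve B).
# The third correlation, r_A1_A2, enters the formula directly.
# Using n - 1 in the denominator is an assumption of this sketch.
cov_overlap <- function(r_a1b, r_a2b, r_a1a2, n) {
  num <- r_a1a2 * (1 - r_a1b^2 - r_a2b^2) -
    0.5 * r_a1b * r_a2b * (1 - r_a1b^2 - r_a2b^2 - r_a1a2^2)
  num / (n - 1)
}

# Same (made-up) A-B correlations, two different values of r(A1,A2)
cov_low  <- cov_overlap(0.30, 0.25, r_a1a2 = 0.20, n = 200)
cov_high <- cov_overlap(0.30, 0.25, r_a1a2 = 0.60, n = 200)
print(c(low = cov_low, high = cov_high))
```

With these inputs, the covariance is larger when A1 and A2 are more strongly correlated, which is why that correlation cannot be dropped before V is computed.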

Now, 'mod' is also irrelevant, since it is a constant. So, the model will just pool the correlations -- which are all reflections of the A-B correlation -- while taking into account the dependency among multiple estimates of this correlation from the same study.
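
With a constant moderator and no random effects, this pooling amounts to a generalized least squares average weighted by the inverse of V. A base R sketch for a single study, where the estimates and the entries of V are hypothetical values chosen for illustration:

```r
# Two (made-up) estimates of the A-B correlation from the same study
y <- c(0.30, 0.25)

# Hypothetical 2x2 var-cov matrix (rows/columns 2-3 of the full V)
V <- matrix(c(0.00416, 0.00246,
              0.00246, 0.00442), nrow = 2, byrow = TRUE)

W      <- solve(V)                 # inverse of the var-cov matrix
pooled <- sum(W %*% y) / sum(W)    # GLS estimate for an intercept-only model
se     <- sqrt(1 / sum(W))         # standard error of the pooled estimate

# For comparison: standard error if the covariance were ignored
se_indep <- sqrt(1 / sum(1 / diag(V)))
print(c(pooled = pooled, se = se, se_indep = se_indep))
```

Ignoring the positive covariance would understate the standard error here, which is the practical reason for carrying V through the analysis.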

Best,
Wolfgang

>-----Original Message-----
>From: Lukas Wallrich [mailto:l.wallrich using gold.ac.uk]
>Sent: Saturday, 12 December, 2020 10:59
>To: Viechtbauer, Wolfgang (SP)
>Cc: r-sig-meta-analysis using r-project.org
>Subject: Re: [R-meta] Meta-analytical test of mediation model including
>dependent tests - looking to resolve metafor issue or find alternative
>approach
>
>Dear Wolfgang,
>
>Thank you very much for your quick and helpful response! The difference
>indeed becomes much smaller (though it does not disappear in my case) when I
>allow the heterogeneity to differ between the pairs in the joint estimation.
>Now I need to decide whether that is appropriate - if I understand Rubio-
>Aparicio et al. (2019) correctly, the decision depends on whether I expect
>heteroscedasticity based on theory, rather than on any test of the data? In
>case the data is informative, I share the full set below - from the data and
>the theory, it appears that there is much more heterogeneity in some
>correlations than others.
>
>Regarding the alternative approach of combining correlation matrices: that
>is actually where I started, but I did not understand how to deal with one
>type of dependency: measures nested into constructs. Specifically, in my
>data, some studies use two measures of the same construct, which I would
>both like to use to estimate the relevant correlations. For instance,
>affective and cognitive are both measures of attitudes, so they should
>inform those correlations rather than be estimated differently. Is there any
>way to include that into your suggested approach?
>
>Many thanks,
>
>Lukas
>
>meta_data <- tibble::tribble(
>  ~study, ~measure, ~pair, ~r, ~N, ~inv_N,
>  "longit", "T1", "pos_div", 0.22, 211, 0.005,
>  "longit", "T1", "pos_neg", 0.16, 211, 0.005,
>  "longit", "T1", "neg_div", -0.02, 211, 0.005,
>  "longit", "T2", "pos_div", 0.33, 211, 0.005,
>  "longit", "T2", "pos_neg", -0.05, 211, 0.005,
>  "longit", "T2", "neg_div", -0.28, 211, 0.005,
>  "UK_mediation", "only", "neg_div", -0.3, 224, 0.004,
>  "UK_mediation", "only", "pos_div", 0.43, 224, 0.004,
>  "UK_mediation", "only", "pos_neg", -0.01, 224, 0.004,
>  "UK_mediation", "affective", "pos_att", -0.38, 224, 0.004,
>  "UK_mediation", "cognitive", "pos_att", -0.2, 224, 0.004,
>  "UK_mediation", "affective", "div_att", -0.44, 224, 0.004,
>  "UK_mediation", "cognitive", "div_att", -0.55, 224, 0.004,
>  "UK_mediation", "affective", "neg_att", 0.18, 224, 0.004,
>  "UK_mediation", "cognitive", "neg_att", 0.21, 224, 0.004,
>  "DE_mediation", "only", "pos_div", 0.35, 2618, 0,
>  "DE_mediation", "only", "neg_div", -0.16, 2618, 0,
>  "DE_mediation", "only", "div_att", -0.53, 2618, 0,
>  "DE_mediation", "only", "pos_neg", 0.25, 2618, 0,
>  "DE_mediation", "only", "pos_att", -0.43, 2618, 0,
>  "DE_mediation", "only", "neg_att", 0.26, 2618, 0,
>  "longit", "T2_prej", "pos_att", -0.222, 211, 0.005,
>  "longit", "T2_prej", "neg_att", 0.137, 211, 0.005,
>  "longit", "T2_prej", "div_att", -0.227, 211, 0.005,
>  "longit", "T1_therm", "neg_att", 0.148, 211, 0.005,
>  "longit", "T1_therm", "div_att", -0.17, 211, 0.005,
>  "longit", "T1_therm", "pos_att", -0.325, 211, 0.005,
>  "longit", "T2_therm", "pos_att", -0.356, 211, 0.005,
>  "longit", "T2_therm", "neg_att", 0.103, 211, 0.005,
>  "longit", "T2_therm", "div_att", -0.231, 211, 0.005,
>  "India", "divval_pref", "pos_div", 0.14, 152, 0.007,
>  "India", "divval_instr", "pos_div", -0.058, 152, 0.007,
>  "India", "divval_pref", "neg_div", -0.016, 152, 0.007,
>  "India", "divval_instr", "neg_div", -0.248, 152, 0.007,
>  "India", "divval_pref", "pos_neg", 0.003, 152, 0.007,
>  "India", "divval_pref", "div_att", -0.213, 152, 0.007,
>  "India", "divval_instr", "div_att", -0.208, 152, 0.007,
>  "India", "divval_pref", "pos_att", -0.563, 152, 0.007,
>  "India", "divval_pref", "neg_att", -0.016, 152, 0.007,
>  "NCS_2018", "divval_pref", "pos_neg", -0.151, 329, 0.003,
>  "NCS_2018", "divval_pref", "pos_div", 0.115, 316, 0.003,
>  "NCS_2018", "divval_pref", "neg_div", -0.08, 315, 0.003,
>  "NCS_2018", "divval_better", "pos_div", 0.037, 327, 0.003,
>  "NCS_2018", "divval_better", "neg_div", -0.006, 326, 0.003,
>  "NCS_2018", "divval_pref", "pos_att", -0.264, 319, 0.003,
>  "NCS_2018", "divval_pref", "neg_att", 0.068, 318, 0.003,
>  "NCS_2018", "divval_pref", "div_att", -0.077, 317, 0.003,
>  "NCS_2018", "divval_better", "div_att", -0.069, 320, 0.003,
>  "NCS_2019", "divval_pref", "pos_neg", -0.139, 434, 0.002,
>  "NCS_2019", "divval_pref", "pos_div", 0.14, 110, 0.009,
>  "NCS_2019", "divval_pref", "neg_div", -0.167, 107, 0.009,
>  "NCS_2019", "divval_better", "pos_div", 0.074, 106, 0.009,
>  "NCS_2019", "divval_better", "neg_div", -0.206, 103, 0.01,
>  "NCS_2019", "divval_pref", "pos_att", -0.295, 447, 0.002,
>  "NCS_2019", "divval_pref", "neg_att", 0.191, 432, 0.002,
>  "NCS_2019", "divval_pref", "div_att", 0.126, 112, 0.009,
>  "NCS_2019", "divval_better", "div_att", -0.223, 107, 0.009
>)
>
>On Fri, 11 Dec 2020 at 17:32, Viechtbauer, Wolfgang (SP)
><wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
>Dear Lukas,
>
>It is to be expected that the results from separate analyses will differ
>from the multilevel model. This issue, albeit in a somewhat different
>modeling context, is discussed here:
>
>http://www.metafor-project.org/doku.php/tips:comp_two_independent_estimates
>
>Also, 1/N is not quite the way the sampling variances for correlation
>coefficients should be calculated, but given that the correlations are not
>so large, this is probably not going to matter that much. One can also
>debate whether one should meta-analyze raw correlation coefficients, but
>let's leave this issue aside for now.
>
>But the results don't look strange to me. It's also a rather small dataset,
>so changes in the modeling approach can lead to noticeably different
>results.
>
>I am not sure if I would agree with the general approach here to deal with
>the multilevel structure though. Let's take the first study:
>
> 1 UK_mediation affective pos_att -0.38  0.00446
> 2 UK_mediation cognitive pos_att -0.2   0.00446
> 3 UK_mediation affective neg_att  0.18  0.00446
> 4 UK_mediation cognitive neg_att  0.21  0.00446
>
>So, as far as I can tell, there are 4 variables that were measured in this
>study: affective, cognitive, pos_att, and neg_att. If so, there should be 6
>correlations in total, but you are showing only 4 of them. If one would also
>know the affective-cognitive and the pos_att-neg_att correlations, then one
>can construct the whole 6x6 var-cov matrix of the 6 correlations (or their
>r-to-z transformed values). The 'devel' version of metafor has a function
>for this called rcalc():
>
>https://wviechtb.github.io/metafor/reference/rcalc.html
>
>One can then use a 'proper' multivariate model. See here for an example:
>
>https://wviechtb.github.io/metafor/reference/dat.craft2003.html
>
>However, with 5 studies, I might even just consider using a model with a
>properly constructed V matrix and no further random effects. There doesn't
>seem to be a huge amount of heterogeneity in these data in the first place.
>
>Best,
>Wolfgang