[R-meta] Dealing with effect size dependance with a small number of studies

Tue Jan 5 14:06:49 CET 2021

To construct an approximate 'V' matrix, something like this should work:

library(clubSandwich)
V <- impute_covariance_matrix(vi = MA_dat_raw$SV, cluster = paste0(MA_dat_raw$IDstudy, ".", MA_dat_raw$IDsubsample), r = 0.6)

I paste together the study and sample ID variables, but this is only necessary if IDsubsample is coded in such a way that the same value of IDsubsample might be used for two different studies. For example, if:

IDstudy IDsubsample IDOutcome IDeffect
1       1           outcomeA  1
1       1           outcomeA  2
1       1           outcomeB  1
1       1           outcomeB  2
1       2           outcomeA  1
1       2           outcomeA  2
1       2           outcomeB  1
1       2           outcomeB  2
2       1           outcomeA  1
2       1           outcomeB  1

then setting cluster = IDsubsample would not make sense, since subsample 1 of study 1 and subsample 1 of study 2 would be treated as the same cluster, which allows for correlation across studies. On the other hand, if the last two lines were:

2       3           outcomeA  1
2       3           outcomeB  1

then using IDsubsample as the cluster variable would be fine.
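
To make the point about the coding of IDsubsample concrete, here is a minimal base R sketch. The data frame 'dat' is a toy reconstruction of the first table above, and 'ids_reused' is just an illustrative name for the check, not something from the actual dataset.

```r
# Toy data mirroring the first table above (the last two rows belong to study 2).
dat <- data.frame(
  IDstudy     = c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2),
  IDsubsample = c(1, 1, 1, 1, 2, 2, 2, 2, 1, 1)
)

# Using IDsubsample alone: subsample 1 of study 2 is merged with
# subsample 1 of study 1, so only two "clusters" are found.
n_clusters_raw <- length(unique(dat$IDsubsample))

# Pasting study and subsample IDs together keeps the clusters separate.
cluster_id <- paste0(dat$IDstudy, ".", dat$IDsubsample)
n_clusters_pasted <- length(unique(cluster_id))

# Quick check: is any subsample ID used in more than one study?
ids_reused <- any(tapply(dat$IDstudy, dat$IDsubsample,
                         function(s) length(unique(s)) > 1))

table(cluster_id)
```

If 'ids_reused' is FALSE for your data, IDsubsample can be used as the cluster variable directly; otherwise the pasted version is the safer choice.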

Of course, r = 0.6 is just an assumed value for the correlation between the sampling errors, so you need to think about what is plausible for your data.

Also, this assumes a single correlation for all pairs of estimates within a cluster, regardless of whether the two estimates are for the same outcome or for different outcomes. Usually, I would expect a stronger correlation for two estimates of the same outcome. Constructing a V matrix that reflects this is trickier.
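
One way such a V matrix could be built by hand is sketched below in base R. Everything here is an assumption for illustration: the toy data, the column name 'cluster', and the two correlations 'r_same' and 'r_diff' (0.8 and 0.5 are arbitrary values, not recommendations).

```r
# Hypothetical toy data: two subsamples with estimates for outcomes A and B.
dat <- data.frame(
  cluster   = c("1.1", "1.1", "1.1", "1.1", "1.2", "1.2"),
  IDOutcome = c("A", "A", "B", "B", "A", "B"),
  SV        = c(0.04, 0.05, 0.04, 0.06, 0.03, 0.03)
)
r_same <- 0.8  # assumed correlation between estimates of the same outcome
r_diff <- 0.5  # assumed correlation between estimates of different outcomes

same_cluster <- outer(dat$cluster, dat$cluster, "==")
same_outcome <- outer(dat$IDOutcome, dat$IDOutcome, "==")

# Correlation matrix: 0 across clusters, r_same or r_diff within a cluster.
R <- same_cluster * ifelse(same_outcome, r_same, r_diff)
diag(R) <- 1

# Covariance = correlation times the product of the two standard errors.
se <- sqrt(dat$SV)
V  <- R * outer(se, se)

# V should be positive definite to be usable as a covariance matrix.
is_pd <- all(eigen(V, symmetric = TRUE)$values > 0)
round(V, 3)
```

The result is block-diagonal with one block per subsample, and the sampling variances remain on the diagonal. Not every combination of 'r_same' and 'r_diff' yields a positive definite matrix, hence the check at the end.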

However, in the end, what you are doing here is just trying to make the 'working model' somewhat more realistic (by not assuming 0 for the correlation, which is in essence what you do when you use V=SV). The cluster-robust inference approach then takes this working model as input and computes the standard errors of the fixed effects in such a way that even if the model is misspecified, the estimated standard errors are (asymptotically) correct.
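
To show what "takes the working model as input" means, here is a deliberately simplified base R sketch of a cluster-robust (sandwich) variance estimator. It is not what clubSandwich computes: clubSandwich applies small-sample corrections (e.g., CR2), while this is the plain uncorrected form (often called CR0). The data are simulated, the model is intercept-only, and the random effects are left out of the working model to keep it short.

```r
set.seed(1)

# Hypothetical toy data: 6 studies with 3 estimates each.
study <- rep(1:6, each = 3)
vi    <- rep(0.05, 18)
yi    <- 0.3 + rep(rnorm(6, sd = 0.2), each = 3) + rnorm(18, sd = sqrt(vi))

# Working model: sampling errors correlated at r = 0.6 within studies.
R <- outer(study, study, "==") * 0.6
diag(R) <- 1
M <- R * outer(sqrt(vi), sqrt(vi))   # working covariance matrix of yi
W <- solve(M)                        # weight matrix implied by the working model
X <- matrix(1, nrow = 18, ncol = 1)  # intercept-only design matrix

# Weighted least squares estimate under the working model.
bread <- solve(t(X) %*% W %*% X)
beta  <- bread %*% t(X) %*% W %*% yi
e     <- as.vector(yi - X %*% beta)

# "Meat": sum over studies of the outer products of the study-level scores.
meat <- matrix(0, 1, 1)
for (j in unique(study)) {
  idx     <- study == j
  score_j <- t(X[idx, , drop = FALSE]) %*% W[idx, idx] %*% e[idx]
  meat    <- meat + score_j %*% t(score_j)
}

se_model  <- sqrt(diag(bread))                     # trusts the working model
se_robust <- sqrt(diag(bread %*% meat %*% bread))  # uncorrected sandwich (CR0)
c(estimate = beta, se_model = se_model, se_robust = se_robust)
```

The working model determines the weights (and hence the estimate), while the robust standard error is based on the observed residuals within each cluster, so it does not rely on the working model being correct.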

Best,
Wolfgang

>-----Original Message-----
>From: Danka Puric [mailto:djaguard using gmail.com]
>Sent: Tuesday, 05 January, 2021 13:28
>To: Viechtbauer, Wolfgang (SP)
>Cc: James Pustejovsky; R meta
>Subject: Re: [R-meta] Dealing with effect size dependance with a small number of
>studies
>
>Dear Wolfgang,
>
>message received (both times) :)
>
>I'll respond inline as well.
>
>On Tue, Jan 5, 2021 at 12:16 PM Viechtbauer, Wolfgang (SP)
><wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
>>
>> Ok, I think I am replying to the right post now ...
>>
>> Responses again below.
>>
>> Best,
>> Wolfgang
>>
>> >-----Original Message-----
>> >From: Danka Puric [mailto:djaguard using gmail.com]
>> >Sent: Tuesday, 05 January, 2021 10:36
>> >To: Viechtbauer, Wolfgang (SP)
>> >Cc: James Pustejovsky; R meta
>> >Subject: Re: [R-meta] Dealing with effect size dependance with a small
>> >number of studies
>> >
>> >Dear James, Wolfgang,
>> >
>> >Thanks a lot for the quick and informative responses!
>> >
>> >1. I guess I made things unnecessarily complicated :) The thing is, I know
>> >that all these models are essentially the same:
>> >
>> >notationa <- rma.mv(ES_corrected, SV, random = ~ factor(IDeffect) | IDstudy,
>> >data=MA_dat_raw)
>> >notationb <- rma.mv(ES_corrected, SV, random = ~ 1 | IDstudy/IDeffect,
>> >data=MA_dat_raw)
>> >notationc <- rma.mv(ES_corrected, SV, random = ~ IDeffect | IDstudy,
>> >data=MA_dat_raw)
>> >
>> >but I read in the Konstantopoulos (2011) example that they only deal with
>> >the dependence arising from effect sizes coming from the same studies, but
>> >NOT with dependence arising from multiple ES coming from the same group of
>> >participants. I then erroneously concluded that in order to deal with this
>> >type of dependence I would need to use struct = "UN", but I understand now
>> >that's not the case.
>>
>> Yes, the 'struct' part is a different issue.
>>
>> If you directly want to account for dependency due to multiple effect size
>> estimates coming from the same group of subjects, you would need to calculate
>> the covariance between the sampling errors and include this in the 'V' matrix
>> (the second argument in rma.mv(), to which you are passing 'SV'). In addition,
>> we then also want to use a model like the one above to account for possible
>> dependency in the underlying true effects. That is in fact how things are done
>> in the Berkey example (and since 'outcome' is meaningful there, we can use
>> struct="UN" to have an estimate of tau^2 for the two different outcomes).
>
>Thanks for the additional clarification! We will try to create "V"
>using the impute_covariance_matrix function James suggested and then
>use it instead of SV in our syntax:
>model <- rma.mv(ES_corrected, V, random =  ~ 1 | IDstudy / IDsubsample
>/ IDeffect, data=MA_dat_raw)
>
>> >Also, indeed, IDeffect does not refer to the type of outcome in a study.
>> >Actually, we do have an outcome variable DV which could be used instead of
>> >IDeffect, but sometimes it has the same value for several ESs in the same
>> >group of participants, so it didn't seem appropriate to use it in this case.
>> >I did realize the model with IDeffect was not structured like Berkey et al.
>> >but thought it would be a better option, as IDeffect variables have unique
>> >values across IDstudy.
>> >
>> >As for sigma1.1 and sigma2.1, it's quite possible I just got something mixed
>> >up when I compared different notations (I may have plotted the wrong model),
>> >but anyway, this model is definitely wrong for the data, so I'll just leave
>> >it at that.
>>
>> Agreed.
>>
>> >2. So, just to be sure I got this right, the following model
>> >
>> >model <- rma.mv(ES_corrected, SV, random =  ~ 1 | IDstudy / IDsubsample/
>> >IDeffect, data=MA_dat_raw)
>> >
>> >in combination with clubSandwich robust estimates will yield adequate effect
>> >size estimates for the situation where the same group of participants
>> >provided more than one ES? That's actually the model I fit first, but then
>> >thought wasn't appropriate after all.
>>
>> Yes, this looks like a sensible approach. In principle, since you mentioned it
>> above, you could even consider:
>>
>> random =  ~ 1 | IDstudy / IDsubsample / IDOutcome / IDeffect
>>
>> where IDOutcome is the id variable for the different outcomes. This model
>> is in principle possible, since you mentioned that sometimes, within a
>> particular subsample, there are multiple effects for the same outcome (if this
>> were not the case, then the variance components for IDOutcome and IDeffect
>> would not be separately identifiable). However, this may be pushing things a
>> bit with k=69 estimates. In essence, this is a five-level model, so two levels
>> more than the 'three-level model' described by Konstantopoulos (2011):
>>
>> http://www.metafor-project.org/doku.php/analyses:konstantopoulos2011
>>
>> (in multilevel model parlance, the standard random-effects model would be
>> considered a two-level model, so the number of levels is 1 + the number of
>> hierarchical levels you are adding via 'random').
>
>Yes, adding an IDOutcome variable would actually be a great way to
>approach the issue of same/different outcomes coming from the same
>subsample. But I also agree that such a model might be too complex for
>the data and I would expect both IDOutcome and IDeffect to have
>variances very close to zero. Still, we can give it a try and see if
>it makes sense.
>
>> >I will also now look into inputting the covariance matrices and see if it's
>> >possible to implement with the data we have. Thanks for suggesting this,
>> >James.
>>
>> In this case, covariances in the sampling errors will occur only within
>> subsamples. So, the 'V' matrix will be block-diagonal with blocks corresponding
>> to the subsamples. For example, suppose we have
>>
>> IDstudy IDsubsample IDOutcome IDeffect
>> 1       1           outcomeA  1
>> 1       1           outcomeA  2
>> 1       1           outcomeB  1
>> 1       1           outcomeB  2
>> 1       2           outcomeA  1
>> 1       2           outcomeA  2
>> 1       2           outcomeB  1
>> 1       2           outcomeB  2
>>
>> so a study with two groups and in each group both outcomes (A and B) were
>> measured in two different ways (e.g., using two different scales), leading to
>> two different effects. Then the V matrix for this study would be an 8x8 matrix
>> that is composed of two 4x4 blocks.
>
>Again, thank you for clarifying this. I assumed this was how it
>worked, I'm just still not sure how to create V from the data we have,
>but I'll get onto that straight away!
>
>Thanks so much!
>Danka