[R-meta] Dealing with effect size dependance with a small number of studies
Michael Dewey
lists using dewey.myzen.co.uk
Wed Jan 6 12:00:31 CET 2021
Dear Danka
This may have already been suggested somewhere but trying a range of
plausible values and seeing if it makes a difference is never a bad
idea. Sometimes the world is kind to you.
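
A minimal sketch of such a sensitivity check, in base R only. The effect sizes, sampling variances, and cluster labels below are made up, and pooled_estimate() is a hypothetical helper that computes a simple common-effect GLS estimate, not the full rma.mv() model with cluster-robust standard errors discussed further down. The point is only the loop over assumed values of r.

```r
# Hypothetical toy data: 6 estimates from 3 subsamples (not from the thread)
yi      <- c(0.30, 0.42, 0.10, 0.18, 0.55, 0.47)
vi      <- c(0.04, 0.05, 0.03, 0.03, 0.06, 0.05)
cluster <- c("a", "a", "b", "b", "c", "c")

# Hypothetical helper: common-effect GLS estimate for an assumed correlation r
pooled_estimate <- function(yi, vi, cluster, r) {
  sei <- sqrt(vi)
  # Covariance r * se_i * se_j within a cluster, 0 between clusters
  V <- r * outer(sei, sei) * outer(cluster, cluster, "==")
  diag(V) <- vi
  W <- solve(V)  # inverse of the assumed covariance matrix
  c(r = r, estimate = sum(W %*% yi) / sum(W), se = sqrt(1 / sum(W)))
}

# Try a range of plausible correlations and collect the results in one table
r_values <- c(0, 0.2, 0.4, 0.6, 0.8)
sensitivity <- as.data.frame(
  t(sapply(r_values, function(r) pooled_estimate(yi, vi, cluster, r)))
)
print(sensitivity)
```

If the estimate and its standard error barely move across the rows, the exact choice of r probably matters little for the conclusions.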
Michael
On 06/01/2021 09:00, Danka Puric wrote:
> Dear Wolfgang,
>
> To be honest, I wasn't even expecting a reply to my last email, and
> then I got a ready-to-go syntax with a clear explanation, so thank
> you, this is exactly what we needed! Our IDsubsample variable has
> unique values across the whole dataset, so I ended up using only that.
> I checked the resulting matrix and it looks as it should, i.e.
> block-diagonal.
>
> And yes, we'll have to figure out which exact value of the correlation
> to assume. For example, in this function:
> https://rdrr.io/cran/MAd/man/agg.html by default the correlation is
> set to .50 (with a reference to Wampold et al., 1997), but that is the
> presumed correlation within a study, not within the same subsample.
> We'll try and see whether there's any information specifically
> pertaining to the effects we are studying.
>
> Once again, thank you!
> Danka
>
> On Tue, Jan 5, 2021 at 2:07 PM Viechtbauer, Wolfgang (SP)
> <wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
>>
>> To construct an approximate 'V' matrix, something like this should work:
>>
>> impute_covariance_matrix(MA_dat_raw$SV, cluster=paste0(MA_dat_raw$IDstudy, ".", MA_dat_raw$IDsubsample), r = 0.6)
>>
>> I paste together the study and sample ID variables, but this is only necessary if IDsubsample is coded in such a way that the same value of IDsubsample might be used for two different studies. For example, if:
>>
>> IDstudy IDsubsample IDOutcome IDeffect
>> 1 1 outcomeA 1
>> 1 1 outcomeA 2
>> 1 1 outcomeB 1
>> 1 1 outcomeB 2
>> 1 2 outcomeA 1
>> 1 2 outcomeA 2
>> 1 2 outcomeB 1
>> 1 2 outcomeB 2
>> 2 1 outcomeA 1
>> 2 1 outcomeB 1
>>
>> then setting cluster = IDsubsample would not make sense, since it would allow for correlation across studies. On the other hand, if the last two lines were:
>>
>> 2 3 outcomeA 1
>> 2 3 outcomeB 1
>>
>> then using IDsubsample as the cluster variable would be fine.
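
A quick way to check which situation applies, sketched in base R on the ten-row example above (only the two ID columns are reproduced; the names studies_per_subsample and ambiguous are made up for illustration):

```r
# The ten-row example: subsample ID 1 is reused in study 2
dat <- data.frame(
  IDstudy     = c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2),
  IDsubsample = c(1, 1, 1, 1, 2, 2, 2, 2, 1, 1)
)

# Number of distinct studies in which each subsample ID appears
studies_per_subsample <- tapply(dat$IDstudy, dat$IDsubsample,
                                function(x) length(unique(x)))

# Subsample IDs shared by more than one study; these would wrongly
# be treated as one cluster if IDsubsample were used on its own
ambiguous <- names(studies_per_subsample)[studies_per_subsample > 1]
print(ambiguous)

# Pasting study and subsample IDs together gives an unambiguous cluster ID
dat$cluster <- paste0(dat$IDstudy, ".", dat$IDsubsample)
print(unique(dat$cluster))
```

If ambiguous comes back empty for the real data, IDsubsample alone can serve as the cluster variable.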
>>
>> Of course, the r = 0.6 is something that you need to think about.
>>
>> Also, this assumes a single correlation for pairs of estimates, regardless of whether the two estimates are for the same outcome or for different outcomes. Usually, I would expect a stronger correlation for two estimates of the same outcome. Constructing a V matrix that reflects this is more tricky.
>>
>> However, in the end, what you are doing here is just trying to make the 'working model' somewhat more realistic (by not assuming 0 for the correlation, which is in essence what you do when you use V=SV). The cluster-robust inference approach then takes this working model as input and computes the standard errors of the fixed effects in such a way that even if the model is misspecified, the estimated standard errors are (asymptotically) correct.
>>
>> Best,
>> Wolfgang
>>
>>> -----Original Message-----
>>> From: Danka Puric [mailto:djaguard using gmail.com]
>>> Sent: Tuesday, 05 January, 2021 13:28
>>> To: Viechtbauer, Wolfgang (SP)
>>> Cc: James Pustejovsky; R meta
>>> Subject: Re: [R-meta] Dealing with effect size dependance with a small number of
>>> studies
>>>
>>> Dear Wolfgang,
>>>
>>> message received (both times) :)
>>>
>>> I'll respond inline as well.
>>>
>>> On Tue, Jan 5, 2021 at 12:16 PM Viechtbauer, Wolfgang (SP)
>>> <wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
>>>>
>>>> Ok, I think I am replying to the right post now ...
>>>>
>>>> Responses again below.
>>>>
>>>> Best,
>>>> Wolfgang
>>>>
>>>>> -----Original Message-----
>>>>> From: Danka Puric [mailto:djaguard using gmail.com]
>>>>> Sent: Tuesday, 05 January, 2021 10:36
>>>>> To: Viechtbauer, Wolfgang (SP)
>>>>> Cc: James Pustejovsky; R meta
>>>>> Subject: Re: [R-meta] Dealing with effect size dependance with a small
>>>>> number of studies
>>>>>
>>>>> Dear James, Wolfgang,
>>>>>
>>>>> Thanks a lot for the quick and informative responses!
>>>>>
>>>>> 1. I guess I made things unnecessarily complicated :) The thing is, I know
>>>>> that all these models are essentially the same:
>>>>>
>>>>> notationa <- rma.mv(ES_corrected, SV, random = ~ factor(IDeffect) | IDstudy,
>>>>> data=MA_dat_raw)
>>>>> notationb <- rma.mv(ES_corrected, SV, random = ~ 1 | IDstudy/IDeffect,
>>>>> data=MA_dat_raw)
>>>>> notationc <- rma.mv(ES_corrected, SV, random = ~ IDeffect | IDstudy,
>>>>> data=MA_dat_raw)
>>>>>
>>>>> but I read in the Konstantopoulos (2011) example that they only deal with
>>>>> the dependence arising from effect sizes coming from the same studies, but
>>>>> NOT with dependence arising from multiple ES coming from the same group of
>>>>> participants. I then erroneously concluded that in order to deal with this
>>>>> type of dependence I would need to use struct = "UN", but I understand now
>>>>> that's not the case.
>>>>
>>>> Yes, the 'struct' part is a different issue.
>>>>
>>>> If you directly want to account for dependency due to multiple effect size
>>>> estimates coming from the same group of subjects, you would need to calculate
>>>> the covariance between the sampling errors and include this in the 'V' matrix
>>>> (the second argument in rma.mv(), to which you are passing 'SV'). In addition,
>>>> we then also want to use a model like the one above to account for possible
>>>> dependency in the underlying true effects. That is in fact how things are done
>>>> in the Berkey example (and since 'outcome' is meaningful there, we can use
>>>> struct="UN" to have an estimate of tau^2 for the two different outcomes).
>>>
>>> Thanks for the additional clarification! We will try to create "V"
>>> using the impute_covariance_matrix function James suggested and then
>>> use it instead of SV in our syntax:
>>> model <- rma.mv(ES_corrected, V, random = ~ 1 | IDstudy / IDsubsample
>>> / IDeffect, data=MA_dat_raw)
>>>
>>>>> Also, indeed, IDeffect does not refer to the type of outcome in a study.
>>>>> Actually, we do have an outcome variable DV which could be used instead of
>>>>> IDeffect, but sometimes it has the same value for several ESs in the same
>>>>> group of participants, so it didn't seem appropriate to use it in this case.
>>>>> I did realize the model with IDeffect was not structured like Berkey et al.
>>>>> but thought it would be a better option, as IDeffect variables have unique
>>>>> values across IDstudy.
>>>>>
>>>>> As for sigma1.1 and sigma2.1, it's quite possible I just got something mixed
>>>>> up when I compared different notations (I may have plotted the wrong model),
>>>>> but anyway, this model is definitely wrong for the data, so I'll just leave
>>>>> it at that.
>>>>
>>>> Agreed.
>>>>
>>>>> 2. So, just to be sure I got this right, the following model
>>>>>
>>>>> model <- rma.mv(ES_corrected, SV, random = ~ 1 | IDstudy / IDsubsample/
>>>>> IDeffect, data=MA_dat_raw)
>>>>>
>>>>> in combination with clubSandwich robust estimates will yield adequate effect
>>>>> size estimates for the situation where the same group of participants
>>>>> provided more than one ES? That's actually the model I fit first, but then
>>>>> thought wasn't appropriate after all.
>>>>
>>>> Yes, this looks like a sensible approach. In principle, since you mentioned it
>>>> above, you could even consider:
>>>>
>>>> random = ~ 1 | IDstudy / IDsubsample / IDOutcome / IDeffect
>>>>
>>>> where IDOutcome is the id variable for the different outcomes. This model
>>>> is in principle possible, since you mentioned that sometimes, within a
>>>> particular subsample, there are multiple effects for the same outcome (if this
>>>> were not the case, then IDOutcome and IDeffect would not be uniquely
>>>> identifiable). However, this may be pushing things a bit with k=69 estimates. In
>>>> essence, this is a five-level model, so two levels more than the 'three-level
>>>> model' described by Konstantopoulos (2011):
>>>>
>>>> http://www.metafor-project.org/doku.php/analyses:konstantopoulos2011
>>>>
>>>> (in multilevel model parlance, the standard random-effects model would be
>>>> considered a two-level model, so the number of levels is 1 + the number of
>>>> hierarchical levels you are adding via 'random').
>>>
>>> Yes, adding an IDOutcome variable would actually be a great way to
>>> approach the issue of same/different outcomes coming from the same
>>> subsample. But I also agree that such a model might be too complex for
>>> the data and I would expect both IDOutcome and IDeffect to have
>>> variances very close to zero. Still, we can give it a try and see if
>>> it makes sense.
>>>
>>>>> I will also now look into inputting the covariance matrices and see if it's
>>>>> possible to implement with the data we have. Thanks for suggesting this,
>>>>> James.
>>>>
>>>> In this case, covariances in the sampling errors will occur only within
>>>> subsamples. So, the 'V' matrix will be block-diagonal with blocks corresponding
>>>> to the subsamples. For example, suppose we have
>>>>
>>>> IDstudy IDsubsample IDOutcome IDeffect
>>>> 1 1 outcomeA 1
>>>> 1 1 outcomeA 2
>>>> 1 1 outcomeB 1
>>>> 1 1 outcomeB 2
>>>> 1 2 outcomeA 1
>>>> 1 2 outcomeA 2
>>>> 1 2 outcomeB 1
>>>> 1 2 outcomeB 2
>>>>
>>>> so a study with two groups and in each group both outcomes (A and B) were
>>>> measured in two different ways (e.g., using two different scales), leading to
>>>> two different effects. Then the V matrix for this study would be an 8x8 matrix
>>>> that is composed of two 4x4 blocks.
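
The block structure described above can be sketched in base R. The sampling variances and the value r = 0.6 are made-up placeholders, and the constant-correlation rule used here (covariance = r * se_i * se_j within a subsample) is an assumption for illustration, not something estimated from data.

```r
# One study, two subsamples, four estimates each (as in the table above)
IDsubsample <- rep(c(1, 2), each = 4)

# Hypothetical sampling variances for the eight estimates
vi <- c(0.020, 0.025, 0.030, 0.022, 0.040, 0.035, 0.038, 0.042)

r   <- 0.6        # assumed correlation between sampling errors
sei <- sqrt(vi)   # standard errors

# TRUE where two estimates come from the same subsample
same_subsample <- outer(IDsubsample, IDsubsample, "==")

# Covariances within a subsample, zeros between subsamples
V <- r * outer(sei, sei) * same_subsample
diag(V) <- vi     # sampling variances on the diagonal

print(round(V, 3))  # 8x8, with two non-zero 4x4 blocks on the diagonal
```

The upper-right and lower-left 4x4 corners are all zero, which is what "block-diagonal" means here.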
>>>
>>> Again, thank you for clarifying this. I assumed this was how it
>>> worked, I'm just still not sure how to create V from the data we have,
>>> but I'll get onto that straight away!
>>>
>>> Thanks so much!
>>> Danka
>
> _______________________________________________
> R-sig-meta-analysis mailing list
> R-sig-meta-analysis using r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
>
--
Michael
http://www.dewey.myzen.co.uk/home.html