[R-meta] Dealing with effect size dependence with a small number of studies

Wed Jan 6 10:00:05 CET 2021

Dear Wolfgang,

To be honest, I wasn't even expecting a reply to my last email, and
then I got a ready-to-go syntax with a clear explanation, so thank
you, this is exactly what we needed! Our IDsubsample variable has
unique values across the whole dataset, so I ended up using only that.
I checked the resulting matrix and it looks as it should, i.e.
block-diagonal.
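
For anyone who wants to run a similar check, here is a minimal sketch
in base R. It does not call impute_covariance_matrix(); it builds a
constant-correlation V by hand from invented toy values (vi, cluster
and r below are all hypothetical) and then verifies that every entry
linking two different clusters is zero.

```r
# Hypothetical toy data: 5 estimates from 2 subsamples (values invented)
vi      <- c(0.04, 0.05, 0.03, 0.06, 0.02)  # sampling variances
cluster <- c(1, 1, 1, 2, 2)                 # subsample IDs, unique across studies
r       <- 0.6                              # assumed correlation (placeholder)

# TRUE where two estimates come from the same subsample
same_cluster <- outer(cluster, cluster, "==")

# Hand-rolled working V: cov_ij = r * sqrt(v_i * v_j) within a cluster,
# 0 across clusters, and the sampling variances on the diagonal
V <- r * sqrt(outer(vi, vi)) * same_cluster
diag(V) <- vi

# Block-diagonal check: all cross-cluster entries must be exactly zero
is_block_diag <- all(V[!same_cluster] == 0)
is_block_diag
```

The same check can be applied to any full V matrix, as long as the
cluster vector is in the same row order as the matrix.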

And yes, we'll have to figure out which exact value of the correlation
to assume. For example, in this function:
https://rdrr.io/cran/MAd/man/agg.html by default the correlation is
set to .50 (with a reference to Wampold et al., 1997), but that is the
presumed correlation within a study, not within the same subsample.
We'll try and see whether there's any information specifically
pertaining to the effects we are studying.
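
To get a feel for how much the assumed value matters, a rough sketch
like the following may help. It uses invented sampling variances for a
single subsample and a hypothetical helper, build_V(), and shows how
the sampling variance of a simple unweighted average of the estimates,
sum(V) / n^2, changes over a grid of candidate correlations.

```r
# Invented sampling variances for three estimates from one subsample
vi <- c(0.04, 0.05, 0.03)

# Hypothetical helper: constant-correlation covariance matrix for one cluster
build_V <- function(vi, r) {
  V <- r * sqrt(outer(vi, vi))  # off-diagonal covariances
  diag(V) <- vi                 # sampling variances on the diagonal
  V
}

# Candidate correlations to compare (placeholders, not recommendations)
r_grid <- c(0, 0.3, 0.5, 0.6, 0.8)

# Variance of the unweighted mean of the estimates: 1' V 1 / n^2
var_mean <- sapply(r_grid, function(r) sum(build_V(vi, r)) / length(vi)^2)

data.frame(r = r_grid, var_of_mean = var_mean)
```

With r = 0 this reduces to the usual independence result, and the
variance grows as r increases, which is why the choice deserves some
thought and a sensitivity analysis.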

Once again, thank you!
Danka

On Tue, Jan 5, 2021 at 2:07 PM Viechtbauer, Wolfgang (SP)
<wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
>
> To construct an approximate 'V' matrix, something like this should work:
>
> impute_covariance_matrix(MA_dat_raw$SV, cluster=paste0(MA_dat_raw$IDstudy, ".", MA_dat_raw$IDsubsample), r = 0.6)
>
> I paste together the study and sample ID variables, but this is only necessary if IDsubsample is coded in such a way that the same value of IDsubsample might be used for two different studies. For example, if:
>
> IDstudy IDsubsample IDOutcome IDeffect
> 1       1           outcomeA  1
> 1       1           outcomeA  2
> 1       1           outcomeB  1
> 1       1           outcomeB  2
> 1       2           outcomeA  1
> 1       2           outcomeA  2
> 1       2           outcomeB  1
> 1       2           outcomeB  2
> 2       1           outcomeA  1
> 2       1           outcomeB  1
>
> then setting cluster = IDsubsample would not make sense, since it would allow for correlation across studies. On the other hand, if the last two lines were:
>
> 2       3           outcomeA  1
> 2       3           outcomeB  1
>
> then using IDsubsample as the cluster variable would be fine.
>
> Of course, the r = 0.6 is something that you need to think about.
>
> Also, this assumes a single correlation for pairs of estimates, regardless of whether the two estimates are for the same outcome or for different outcomes. Usually, I would expect a stronger correlation for two estimates of the same outcome. Constructing a V matrix that reflects this is more tricky.
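
A minimal base R sketch of one way such a matrix could be built by
hand, under stated assumptions: r_same and r_diff are placeholder
correlations for same-outcome and different-outcome pairs, and the
data values are invented. This is an illustration only, not what
impute_covariance_matrix() does in the call above.

```r
# Invented toy data: subsample 1.1 has 4 estimates, subsample 1.2 has 2
vi      <- c(0.04, 0.05, 0.03, 0.06, 0.02, 0.05)
cluster <- c("1.1", "1.1", "1.1", "1.1", "1.2", "1.2")
outcome <- c("A", "A", "B", "B", "A", "B")

r_same <- 0.8  # placeholder: correlation for two estimates of the same outcome
r_diff <- 0.4  # placeholder: correlation for estimates of different outcomes

same_cluster <- outer(cluster, cluster, "==")
same_outcome <- outer(outcome, outcome, "==")

# Correlation matrix: r_same or r_diff within a cluster, 0 across clusters
R <- ifelse(same_cluster, ifelse(same_outcome, r_same, r_diff), 0)
diag(R) <- 1

# Scale correlations by the standard errors to obtain covariances
V2 <- R * sqrt(outer(vi, vi))

# A hand-built matrix is not guaranteed to be positive definite, so check
is_pd <- all(eigen(V2, symmetric = TRUE)$values > 0)
is_pd
```

If the eigenvalue check fails for a given choice of r_same and r_diff,
the values are not jointly plausible and need to be reconsidered.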
>
> However, in the end, what you are doing here is just trying to make the 'working model' somewhat more realistic (by not assuming 0 for the correlation, which is in essence what you do when you use V=SV). The cluster-robust inference approach then takes this working model as input and computes the standard errors of the fixed effects in such a way that even if the model is misspecified, the estimated standard errors are (asymptotically) correct.
>
> Best,
> Wolfgang
>
> >-----Original Message-----
> >From: Danka Puric [mailto:djaguard using gmail.com]
> >Sent: Tuesday, 05 January, 2021 13:28
> >To: Viechtbauer, Wolfgang (SP)
> >Cc: James Pustejovsky; R meta
> >Subject: Re: [R-meta] Dealing with effect size dependence with a small number of
> >studies
> >
> >Dear Wolfgang,
> >
> >message received (both times) :)
> >
> >I'll respond inline as well.
> >
> >On Tue, Jan 5, 2021 at 12:16 PM Viechtbauer, Wolfgang (SP)
> ><wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
> >>
> >> Ok, I think I am replying to the right post now ...
> >>
> >> Responses again below.
> >>
> >> Best,
> >> Wolfgang
> >>
> >> >-----Original Message-----
> >> >From: Danka Puric [mailto:djaguard using gmail.com]
> >> >Sent: Tuesday, 05 January, 2021 10:36
> >> >To: Viechtbauer, Wolfgang (SP)
> >> >Cc: James Pustejovsky; R meta
> >> >Subject: Re: [R-meta] Dealing with effect size dependence with a small
> >> >number of studies
> >> >
> >> >Dear James, Wolfgang,
> >> >
> >> >Thanks a lot for the quick and informative responses!
> >> >
> >> >1. I guess I made things unnecessarily complicated :) The thing is, I know
> >> >that all these models are essentially the same:
> >> >
> >> >notationa <- rma.mv(ES_corrected, SV, random = ~ factor(IDeffect) | IDstudy,
> >> >data=MA_dat_raw)
> >> >notationb <- rma.mv(ES_corrected, SV, random = ~ 1 | IDstudy/IDeffect,
> >> >data=MA_dat_raw)
> >> >notationc <- rma.mv(ES_corrected, SV, random = ~ IDeffect | IDstudy,
> >> >data=MA_dat_raw)
> >> >
> >> >but I read in the Konstantopoulos (2011) example that they only deal with
> >> >the dependence arising from effect sizes coming from the same studies, but
> >> >NOT with dependence arising from multiple ES coming from the same group of
> >> >participants. I then erroneously concluded that in order to deal with this
> >> >type of dependence I would need to use struct = "UN", but I understand now
> >> >that's not the case.
> >>
> >> Yes, the 'struct' part is a different issue.
> >>
> >> If you directly want to account for dependency due to multiple effect size
> >estimates coming from the same group of subjects, you would need to calculate
> >the covariance between the sampling errors and include this in the 'V' matrix
> >(the second argument in rma.mv(), to which you are passing 'SV'). In addition,
> >we then also want to use a model like the one above to account for possible
> >dependency in the underlying true effects. That is in fact how things are done
> >in the Berkey example (and since 'outcome' is meaningful there, we can use
> >struct="UN" to have an estimate of tau^2 for the two different outcomes).
> >
> >Thanks for the additional clarification! We will try to create "V"
> >using the impute_covariance_matrix function James suggested and then
> >use it instead of SV in our syntax:
> >model <- rma.mv(ES_corrected, V, random =  ~ 1 | IDstudy / IDsubsample
> >/ IDeffect, data=MA_dat_raw)
> >
> >> >Also, indeed, IDeffect does not refer to the type of outcome in a study.
> >> >Actually, we do have an outcome variable DV which could be used instead of
> >> >IDeffect, but sometimes it has the same value for several ESs in the same
> >> >group of participants, so it didn't seem appropriate to use it in this case.
> >> >I did realize the model with IDeffect was not structured like Berkey et al.
> >> >but thought it would be a better option, as IDeffect variables have unique
> >> >values across IDstudy.
> >> >
> >> >As for sigma1.1 and sigma2.1, it's quite possible I just got something mixed
> >> >up when I compared different notations (I may have plotted the wrong model),
> >> >but anyway, this model is definitely wrong for the data, so I'll just leave
> >> >it at that.
> >>
> >> Agreed.
> >>
> >> >2. So, just to be sure I got this right, the following model
> >> >
> >> >model <- rma.mv(ES_corrected, SV, random =  ~ 1 | IDstudy / IDsubsample/
> >> >IDeffect, data=MA_dat_raw)
> >> >
> >> >in combination with clubSandwich robust estimates will yield adequate effect
> >> >size estimates for the situation where the same group of participants
> >> >provided more than one ES? That's actually the model I fit first, but then
> >> >thought wasn't appropriate after all.
> >>
> >> Yes, this looks like a sensible approach. In principle, since you mentioned it
> >above, you could even consider:
> >>
> >> random =  ~ 1 | IDstudy / IDsubsample / IDOutcome / IDeffect
> >>
> >> where IDOutcome is the id variable for the different outcomes. This model
> >is in principle possible, since you mentioned that sometimes, within a
> >particular subsample, there are multiple effects for the same outcome (if this
> >were not the case, then IDOutcome and IDeffect would not be uniquely
> >identifiable). However, this may be pushing things a bit with k=69 estimates. In
> >essence, this is a five-level model, so two levels more than the 'three-level
> >model' described by Konstantopoulos (2011):
> >>
> >> http://www.metafor-project.org/doku.php/analyses:konstantopoulos2011
> >>
> >> (in multilevel model parlance, the standard random-effects model would be
> >considered a two-level model, so the number of levels is 1 + the number of
> >hierarchical levels you are adding via 'random').
> >
> >Yes, adding an IDoutcome variable would actually be a great way to
> >approach the issue of same/different outcomes coming from the same
> >subsample. But I also agree that such a model might be too complex for
> >the data and I would expect both IDoutcome and IDeffect to have
> >variances very close to zero. Still, we can give it a try and see if
> >it makes sense.
> >
> >> >I will also now look into inputting the covariance matrices and see if it's
> >> >possible to implement with the data we have. Thanks for suggesting this,
> >> >James.
> >>
> >> In this case, covariances in the sampling errors will occur only within
> >subsamples. So, the 'V' matrix will be block-diagonal with blocks corresponding
> >to the subsamples. For example, suppose we have
> >>
> >> IDstudy IDsubsample IDOutcome IDeffect
> >> 1       1           outcomeA  1
> >> 1       1           outcomeA  2
> >> 1       1           outcomeB  1
> >> 1       1           outcomeB  2
> >> 1       2           outcomeA  1
> >> 1       2           outcomeA  2
> >> 1       2           outcomeB  1
> >> 1       2           outcomeB  2
> >>
> >> so a study with two groups and in each group both outcomes (A and B) were
> >measured in two different ways (e.g., using two different scales), leading to
> >two different effects. Then the V matrix for this study would be an 8x8 matrix
> >that is composed of two 4x4 blocks.
> >
> >Again, thank you for clarifying this. I assumed this was how it
> >worked, I'm just still not sure how to create V from the data we have,
> >but I'll get onto that straight away!
> >
> >Thanks so much!
> >Danka