[R-meta] Multilevel model between-study variance

Michael Dewey lists at dewey.myzen.co.uk
Wed Jul 10 19:16:16 CEST 2024


Dear Frederik

I would have thought there is a good case here for reporting both your 
model 1 and model 3. If they lead to very different results you may have 
some head-scratching to do, but if they are not too far apart then 
perhaps pick model 1 and report model 3 as a sensitivity analysis.

Michael

On 10/07/2024 16:45, Frederik Zirn via R-sig-meta-analysis wrote:
> Hi James,
> 
> Thank you very much for your clear and helpful answer. I learned a lot!
> 
> I tried changing the value of rho; however, even if I specify rho to be 0, my between-study variance remains 0.
> 
> Regarding the equivalence of the aggregated approach and the "correlated effects" model: at first I was puzzled, because I did not get the same results for the two versions. After further inspection I found out why: the two are equivalent if one computes weighted averages per study. I used the package MAd, function "agg" with Borenstein et al. (2021) formulas, which uses unweighted averages. Using weighted averages as you did (with the aggregate function), I do in fact get the same results.
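> Here is a minimal sketch of that comparison, using my own objects (all_study_designs_combined with columns yi, vi, Study, and rho = 0.59 as before):
> 
> library(metafor)
> library(MAd)
> 
> # unweighted within-study averages (Borenstein et al. formulas, via MAd::agg)
> agg_unweighted <- agg(id = Study, es = yi, var = vi,
>                       method = "BHHR", cor = 0.59,
>                       data = all_study_designs_combined)
> 
> # weighted within-study averages (metafor::aggregate on the escalc object)
> agg_weighted <- aggregate(all_study_designs_combined, cluster = Study, rho = 0.59)
> 
> # random-effects model on each aggregated data set
> rma(es, var, data = agg_unweighted)   # unweighted aggregation
> rma(yi, vi, data = agg_weighted)      # weighted aggregation (the version that gave me the same results)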
> 
> The last point you made, I guess, hit the nail on the head and explains why there is no between-study variance in my multivariate model: 25 (!!) of the 41 effect sizes stem from one dissertation alone. Leaving that dissertation out leads to the following variance components:
>             estim    sqrt  nlvls  fixed       factor
> sigma^2.1  0.0458  0.2140     11     no        Study
> sigma^2.2  0.0017  0.0407     16     no  Study/ES_ID
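> For completeness, the refit without that dissertation is just the CHE code further below applied to a subset (a sketch; "DissertationX" is a placeholder for the actual study label):
> 
> library(metafor)
> library(clubSandwich)
> 
> # drop the dissertation and rebuild the sampling covariance matrix
> dat_sub <- subset(all_study_designs_combined, Study != "DissertationX")
> V_sub <- with(dat_sub,
>               impute_covariance_matrix(vi = vi, cluster = Study, r = 0.59))
> 
> che_sub <- rma.mv(yi = yi, V = V_sub,
>                   random = ~ 1 | Study/ES_ID,
>                   data = dat_sub)
> che_sub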
> 
> Now I am wondering what to do with that dissertation. As I see it, there are three options:
> 
> 1) Relying on the multilevel model and INCLUDING the dissertation. This doesn't feel quite right, as the results of my analyses (besides the overall effect, I'd like to report three moderator analyses that I pre-specified prior to my search; I know I have to be careful with interpretation given the small sample) are probably influenced by the fact that the model detects no between-study variance. The reason for reporting no between-study variance in my article would then simply be the inclusion of that dissertation, and the CHE model would be pretty tenuous because of the distribution of effect sizes.
> 
> 2) Using within-study averaging (which, according to your article, is equivalent to a correlated effects model) and INCLUDING the dissertation.
> If I go for that option: would you consider weighted averaging per study superior to unweighted averaging as proposed by Borenstein et al. (2021)?
> There is one thing I do not understand about the weighted approach (test_agg <- aggregate(all_study_designs_combined, cluster = Study, rho = 0.59)). The pooled effect size for the dissertation is then negative (-0.22, compared to 0.5 with the unweighted average). That seems strange, as only 2 of its 25 effect sizes are negative. Those 2 have a relatively small variance, but there are many other positive effect sizes with approximately the same variance, so I cannot see why the overall estimate should be negative. I therefore calculated the average effect size by hand with inverse-variance weighting and got an aggregated effect size of 0.28. To find the reason for the discrepancy with the R output, I set rho to 0 (test_agg <- aggregate(all_study_designs_combined, cluster = Study, rho = 0)); the effect size in R is then also 0.28. Why would a rho value of 0.59 make the overall effect negative (and why do higher rho values make it even more negative)?
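> To illustrate what I computed by hand, here is a toy version for a single study (made-up numbers, not my data), assuming the weighted aggregate uses GLS weights based on the within-study sampling covariance matrix:
> 
> # two imprecise positive effects and one precise negative effect
> yi  <- c(0.6, 0.5, -0.3)
> vi  <- c(0.20, 0.20, 0.02)
> rho <- 0.59
> 
> # constant-correlation sampling covariance matrix within the study
> V <- rho * tcrossprod(sqrt(vi))
> diag(V) <- vi
> 
> # GLS weights for the pooled effect are proportional to solve(V) %*% 1
> w <- solve(V, rep(1, length(yi)))
> round(w / sum(w), 3)        # the two positive effects receive negative weights
> sum(w * yi) / sum(w)        # pooled effect under rho = 0.59 (about -0.44)
> sum(yi / vi) / sum(1 / vi)  # plain inverse-variance mean, i.e. what rho = 0 gives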
> 
> 3) Relying on the multilevel model WITHOUT the dissertation (even though it met my inclusion criteria). I guess I would need good justification for that.
> 
> Thank you very much for your valuable input!
> 
> All the best
> Frederik
> 
> 
> On Wednesday, July 10, 2024, 04:27 CEST, James Pustejovsky <jepusto using gmail.com> wrote:
> 
>> Hi Frederik,
>>
>> Your interpretation of the parameter estimates is correct, but the
>> estimator of between-study heterogeneity may be quite imprecise if you have
>> only 12 studies. You can get a sense of this by computing profile
>> likelihood confidence intervals for the variance components:
>> confint(che.model)
>> Note that these confidence intervals are based on the assumptions of the
>> working model, so they are not robust to mis-specification in the same
>> sense that robust CIs for the average effect size are robust.
>>
>> Both the point estimate of between-study heterogeneity and its confidence
>> interval could be pretty sensitive to the assumed value of rho (the
>> correlation of the sampling errors). Using a smaller value of rho will tend
>> to produce larger estimates of between-study heterogeneity and somewhat
>> smaller estimates of within-study heterogeneity. So it's definitely helpful
>> to use empirical data to inform this assumption (as you've done) and to
>> conduct sensitivity analyses (as you've also noted).
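>> A quick sketch of such a sensitivity check, reusing the objects from your code quoted below:
>>
>> library(metafor)
>> library(clubSandwich)
>>
>> # refit the CHE working model over a grid of assumed sampling correlations
>> sapply(c(0.2, 0.4, 0.59, 0.8), function(r) {
>>   V_r <- with(all_study_designs_combined,
>>               impute_covariance_matrix(vi = vi, cluster = Study, r = r))
>>   fit <- rma.mv(yi = yi, V = V_r,
>>                 random = ~ 1 | Study/ES_ID,
>>                 data = all_study_designs_combined)
>>   fit$sigma2   # between-study and within-study variance estimates
>> })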
>>
>> To your broader question about whether aggregating is state-of-the-art, I
>> have a new paper (with Man Chen) about exactly this question:
>> https://jepusto.com/publications/Equivalences-between-ad-hoc-strategies-and-models/
>> We argue that aggregating is neither correct nor incorrect, but is rather
>> just a different working model that might or might not be reasonable for
>> your data. Specifically, we show that aggregating is exactly equivalent to
>> using a multivariate working model that has a between-study variance
>> component but no within-study variance component (which we call the
>> "correlated effects" working model), as in the following:
>> ce.model <- rma.mv(yi = yi,
>>                      V = V,
>>                      random = ~ 1 | Study,
>>                      data = all_study_designs_combined)
>> This way of representing the model is helpful because it puts the CE model
>> on equal footing with the CHE model and allows them to be directly compared
>> via likelihood ratio tests:
>> anova(che.model, ce.model)
>> Or otherwise compared and discussed as alternatives.
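>> As a concrete check of that equivalence (assuming all_study_designs_combined is an escalc object and using the same rho as in your code), aggregating and then fitting a standard random-effects model should reproduce the CE fit:
>>
>> agg_dat <- aggregate(all_study_designs_combined, cluster = Study, rho = 0.59)
>> # compare the estimate and variance component with coef(ce.model) and ce.model$sigma2
>> rma(yi, vi, data = agg_dat)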
>>
>> One other note: the sensitivity of the variance component estimates is also
>> affected by the distribution of effect sizes per study. It is therefore
>> useful to examine and report the range of effect estimates per study. If 10
>> of your studies each have a single effect and 2 each have many effect
>> sizes, then the CHE model will be much more tenuous than if all twelve studies
>> have 3 or 4 effect sizes each.
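>> For example, a quick tabulation of effects per study makes this easy to check and report:
>>
>> # number of effect sizes contributed by each study
>> table(all_study_designs_combined$Study)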
>>
>> James
>>
>> On Mon, Jul 8, 2024 at 5:08 AM Frederik Zirn via R-sig-meta-analysis <
>> r-sig-meta-analysis using r-project.org> wrote:
>>
>>> Dear R-sig-meta-analysis community,
>>>
>>> I am a PhD student conducting a meta-analysis of 12 studies with 41 effect
>>> sizes. There is dependency among my effect sizes, as several studies measure
>>> the same outcome at multiple time points or measure multiple outcomes in the
>>> same study.
>>>
>>> My first approach was to aggregate effect sizes per study using the agg()
>>> function of the MAd package:
>>> all_study_designs_combined_agg <- agg(id = Study, es = yi, var = vi,
>>> method = "BHHR", cor = 0.59, data=all_study_designs_combined)
>>>
>>> cor = 0.59 is based on the correlations reported within a study of my
>>> meta-analysis. I am conducting sensitivity analyses with other values.
>>>
>>> 1) However, aggregating within studies is no longer considered
>>> state-of-the-art practice, correct? Or is this still a valid approach to
>>> handle dependent effect sizes?
>>>
>>> Consequently, I aimed to create a multilevel model. Here is the code I
>>> used:
>>>
>>> # Fitting a CHE model with robust variance estimation
>>> library(metafor)
>>> library(clubSandwich)
>>>
>>> # constant sampling correlation assumption
>>> rho <- 0.59
>>>
>>> # constant sampling correlation working model
>>> V <- with(all_study_designs_combined,
>>>           impute_covariance_matrix(vi = vi,
>>>                                    cluster = Study,
>>>                                    r = rho))
>>>
>>> che.model <- rma.mv(yi = yi,
>>>                     V = V,
>>>                     random = ~ 1 | Study/ES_ID,
>>>                     data = all_study_designs_combined)
>>> che.model
>>>
>>> # robust variance estimation
>>> full.model.robust <- robust(che.model,
>>>                             cluster = all_study_designs_combined$Study,
>>>                             clubSandwich = TRUE)
>>> summary(full.model.robust)
>>>
>>> Doing so, I receive the following variance components:
>>>             estim    sqrt  nlvls  fixed       factor
>>> sigma^2.1  0.0000  0.0000     12     no        Study
>>> sigma^2.2  0.1286  0.3587     41     no  Study/ES_ID
>>>
>>> 2) I have trouble interpreting those findings. Does that indicate that all
>>> of the variance in my model comes from within-study variance, and I have no
>>> between-study variance? This does not seem plausible to me. Am I overlooking
>>> something here? Could that be due to the limited sample size (12 studies)?
>>>
>>> Thanks in advance,
>>> Frederik Zirn
>>> PhD student
>>> Chair of Corporate Education
>>> University of Konstanz
>>>

-- 
Michael

