[R-meta] Inverse weighting after estimation of VCOV
pedrosac using staff.uni-marburg.de
Mon Jun 3 23:49:50 CEST 2024
Dear James and Wolfgang,
I think I now understand the reason why the model was weighting my data so
strangely. And your explanations, James, are enough food for thought for
some days. Thank you very much!
Best,
David
On 03.06.2024 at 09:29, Viechtbauer, Wolfgang (NP) wrote:
> One additional point (leaving aside the issue of what the appropriate model for these data is):
>
> rma.mv(yi, vi, V=V_mat, ...)
>
> doesn't make sense. The second unnamed argument ('vi') will be matched to the third argument of rma.mv(), which is argument 'W' for the weights. So you are setting the weights equal to 'vi', which indeed will give more weight to the less precise estimates. Generally, you don't want to manually specify weights (especially in more complex models where there can be an entire weight matrix) unless you really know what you are doing.
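>
> To make the mismatch concrete, here is a minimal sketch (untested, reusing the object names from David's messages):
>
> # the second unnamed argument is matched positionally to 'W', so this sets W = vi:
> res_wrong <- rma.mv(yi, vi, V = V_mat, random = ~ 1 | number,
>                     mods = ~ hospitalbeds + ltcbeds, data = df_complete)
>
> # intended call: name 'V' and let rma.mv() derive the weights from it:
> res_right <- rma.mv(yi, V = V_mat, random = ~ 1 | number,
>                     mods = ~ hospitalbeds + ltcbeds, data = df_complete)
>
> # weights(res_wrong) puts large weights on the imprecise estimates (W = vi),
> # while weights(res_right) puts large weights on the precise ones.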
>
> Best,
> Wolfgang
>
>> -----Original Message-----
>> From: R-sig-meta-analysis <r-sig-meta-analysis-bounces using r-project.org> On Behalf
>> Of James Pustejovsky via R-sig-meta-analysis
>> Sent: Friday, May 31, 2024 15:22
>> To: pedrosac using staff.uni-marburg.de; R Special Interest Group for Meta-Analysis
>> <r-sig-meta-analysis using r-project.org>
>> Cc: James Pustejovsky <jepusto using gmail.com>
>> Subject: Re: [R-meta] Inverse weighting after estimation of VCOV
>>
>> Hi David,
>>
>> Thanks for clarifying your data structure. Based on what you've described,
>> I don't think it makes sense to use vcalc(). The point of vcalc() is to
>> build in covariance between the sampling errors of the effect size
>> estimates. For your one publication that reports 8 studies, each effect
>> size estimate is based on a separate sample of participants (because each
>> estimate comes from a different country). So there's no reason to expect
>> that there would be covariance in the sampling errors.
>>
>> Instead, one might suspect that there would be covariance between the
>> country-specific effect size parameters (i.e., the "true" effect sizes)
>> from this publication. This would be plausible if the same operational
>> procedures (e.g., same recruitment approach, same measurement
>> instrumentation, same follow-up window) were used across the samples in
>> this publication. The conventional way to model this would be to 1) specify
>> effect size estimates as independent but 2) include publication-level
>> random effects in the model to capture shared operational variance within
>> publications. The syntax would be something like:
>> res_metaRE <- rma.mv(
>>   yi, V = vi,
>>   random = ~ 1 | publicationID / number,
>>   mods = ~ hospitalbeds + ltcbeds,
>>   verbose = TRUE,
>>   data = df_complete, sparse = TRUE
>> )
>> You'll need to create a publicationID variable if you don't already have
>> that on the data.
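>>
>> For instance (just a sketch, assuming the Moens rows can be identified by
>> their author label):
>>
>> df_complete$publicationID <- ifelse(grepl("Moens", df_complete$author),
>>                                     "Moens", df_complete$author)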
>>
>> The difficulty with this approach in your case is that there's only one
>> publication that has multiple samples nested within it, so there's not a
>> lot of information available to parse out the variance at the publication
>> level from the variance at the sample level (across countries). You could
>> try using the model fit statistics to compare the model above versus a
>> model that only has random effects at the sample level.
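>>
>> Something along these lines should work for that comparison (a sketch; both
>> models are fit to the same data, so their fit statistics are comparable):
>>
>> res_simple <- rma.mv(yi, V = vi, random = ~ 1 | number,
>>                      mods = ~ hospitalbeds + ltcbeds,
>>                      data = df_complete, sparse = TRUE)
>> fitstats(res_metaRE, res_simple)  # compare AIC/BIC
>> anova(res_metaRE, res_simple)     # likelihood ratio test of the nested models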
>>
>> James
>>
>> On Mon, May 27, 2024 at 8:54 AM David Pedrosa via R-sig-meta-analysis <
>> r-sig-meta-analysis using r-project.org> wrote:
>>
>>> Hi James,
>>>
>>> apologies, my question was not specific enough.
>>>
>>> I have a dataframe with 16 studies, all of which provide some odds
>>> ratios for hospitalisation. 8 studies are from the same publication but
>>> cover different countries. To me there is still reason to believe they
>>> “share more variance” than the rest. Besides, I want to weight by the
>>> total number of subjects from each of the studies. To make it a bit more
>>> complex, we have dug up the number of hospital beds and long-term care
>>> beds for every country, both of which we consider potential moderators.
>>> I ran the random-effects model
>>>
>>> res_metaRE <- rma(yi, vi,
>>>     random = ~ 1 | number, mods = ~ hospitalbeds + ltcbeds,
>>>     verbose=TRUE, data=df_complete)
>>>
>>> for which weights(res_metaRE) provides accurate results. If I estimate
>>> the VCOV matrix, the result shows correct diagonal values, i.e.,
>>> identical to df_complete$vi. But passing the resulting V_mat
>>>
>>> V_mat <- vcalc(vi=vi, cluster=shared_variance, data=df_complete, rho=.7)
>>>
>>> to rma.mv yields estimates that are too high, and in particular the
>>> studies with lower numbers of subjects receive higher weights. I assume
>>> the weighting is just somehow inverted, but I cannot tell whether I am
>>> missing something or whether there is some other mistake in the way I am
>>> estimating the VCOV. 'number' is just the study id.
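>>>
>>> For concreteness, the inversion shows up when putting the model weights
>>> next to the sample sizes (a sketch, using the rma.mv fit res_meta from
>>> my first message):
>>>
>>> cbind(n = df_complete$n_ges, w = round(weights(res_meta), 2))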
>>>
>>> I’m not entirely sure I understand your point about subsetting the
>>> matrix.
>>>
>>> Thanks for your help!
>>> Best,
>>> David
>>>
>>> P.S.: Here are the relevant parts of df_complete
>>>
>>> structure(list(
>>>   number = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15),
>>>   author = c("Aamodt", "Ceylan", "Krause", "Kumar", "Moens, Belgium",
>>>     "Moens, France", "Moens, Italy", "Moens, Canada", "Moens, Mexico",
>>>     "Moens, New Zealand", "Moens, Spain", "Moens, South Korea",
>>>     "Moens, Czech Rep.", "Moens, Hungary", "Moens, USA"),
>>>   year = c(2023, 2022, 2021, 2021, 2015, 2015, 2015, 2015, 2015, 2015,
>>>     2015, 2015, 2015, 2015, 2015),
>>>   n_ges = c(53279, 27, 40, 346141, 837, 4599, 4034, 1381, 1062, 202,
>>>     352, 1565, 92, 241, 20065),
>>>   OR = c(1.06, 1.43, 8.25, 1.454, 2.3, 1.5, 1.4, 1.7, 0.95, 1.97,
>>>     1.09, 0.95, 0.97, 1.44, 1.4),
>>>   hospitalbeds = c(2.77, 3.02, 7.76, 2.77, 5.47, 5.65, 3.12, 2.58, 1,
>>>     2.57, 2.96, 12.77, 6.66, 6.79, 2.77),
>>>   ltcbeds = c(32.3, 9.5, 54.2, 53.9, 66.8, 47.4, 21.3, 46.7, 0, 50.4,
>>>     43.4, 25, 34.9, 42.6, 28.9),
>>>   p_values = c(0.106809128205467, 0.706331045003814, 0.0281267337718951,
>>>     0, 2.43772276381116e-05, 2.76746355676653e-22, 1.01260208850919e-05,
>>>     1.19251123951374e-10, 0.772759462747246, 0.0741077696800058,
>>>     0.74088983860122, 0.68164335922065, 1, 0.183303852299051,
>>>     3.20176730771634e-26),
>>>   shared_variance = c(0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1),
>>>   yi = structure(c(0.0582689081239758, 0.357674444271816,
>>>     2.11021320034659, 0.374318379111328, 0.832909122935104,
>>>     0.405465108108164, 0.336472236621213, 0.53062825106217,
>>>     -0.0512932943875506, 0.678033542749897, 0.0861776962410524,
>>>     -0.0512932943875506, -0.0304592074847086, 0.364643113587909,
>>>     0.336472236621213),
>>>     ni = c(53279, 27, 40, 346141, 837, 4599, 4034, 1381, 1062, 202,
>>>       352, 1565, 92, 241, 20065),
>>>     measure = "GEN"),
>>>   vi = c(0.000835840725678602, 0.638632983584221, 0.604067037193667,
>>>     0.000435509388232691, 0.0467214213223696, 0.00468347897652763,
>>>     0.00538603813506437, 0.0132951153208062, 0.0214123920152818,
>>>     0.142112789690683, 0.0489441998392354, 0.0138688993962097,
>>>     0.186242249276727, 0.0702159732616764, 0.00133268716433697)),
>>>   row.names = c(NA, -15L), class = c("escalc", "data.frame"),
>>>   yi.names = "yi", vi.names = "vi",
>>>   digits = c(est = 4, se = 4, test = 4, pval = 4, ci = 4, var = 4,
>>>     sevar = 4, fit = 4, het = 4))
>>>
>>> On 24.05.2024 at 19:06, James Pustejovsky wrote:
>>>> Hi David,
>>>>
>>>> I don't entirely understand the models that you're looking at, so
>>>> clarifying the following would help in getting good feedback:
>>>> * What is the variable `shared_variance` used in the vcalc call?
>>>> * What is the variable `number` used in the random effects argument of
>>>> rma.mv?
>>>> * How are these variables related?
>>>>
>>>> Additionally, it would be good to check that the vcov matrix created
>>>> by vcalc() is as you intend it to be. Could you pull out the blocks of
>>>> this matrix for a few studies and just verify that they give you
>>>> covariance matrices with a correlation of 0.7? I mean something like:
>>>> vcov_study_k <- V_mat[i:j, i:j]
>>>> cov2cor(vcov_study_k)
>>>> where the indices i:j are the rows in your data corresponding to a
>>>> given study k.
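>>>>
>>>> For example (a sketch, assuming the cluster with multiple estimates is
>>>> the one flagged by shared_variance == 1):
>>>>
>>>> idx <- which(df_complete$shared_variance == 1)
>>>> round(cov2cor(V_mat[idx, idx]), 2)
>>>> # all off-diagonal entries should equal 0.70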
>>>>
>>>> James
>>>>
>>>> On Fri, May 24, 2024 at 10:00 AM David Pedrosa via R-sig-meta-analysis
>>>> <r-sig-meta-analysis using r-project.org> wrote:
>>>>
>>>> Dear all,
>>>>
>>>> I have a basic question about the output of my (gu)estimation of the
>>>> variance-covariance matrix. I have extracted results from very
>>>> heterogeneous studies with OR as effect size (sample sizes between 20
>>>> and 300,000). Since some of the results come from the same study, I
>>>> decided to try to use the VCOV as an input and estimated values
>>>> according to the following formula
>>>>
>>>> V_mat <- vcalc(vi=vi, cluster=shared_variance, data=df_complete, rho=.7)
>>>> res_meta <- rma.mv(yi, vi, V=V_mat,
>>>>     random = ~ 1 | number, mods = ~ hospitalbeds + ltcbeds,
>>>>     verbose=TRUE, data=df_complete)
>>>>
>>>>
>>>> Interestingly, in this case the weighting is reversed, so that most of
>>>> the weight is given to studies with the smallest sample size; something
>>>> that does not happen when using this formula:
>>>>
>>>> res_meta <- rma(yi, vi,
>>>>     random = ~ 1 | number, mods = ~ hospitalbeds + ltcbeds,
>>>>     verbose=TRUE, data=df_complete)
>>>>
>>>> I have tried to understand what is going on, but I am kind of at a
>>>> loss. Could someone please give me some advice?
>>>>
>>>> Thanks in advance,
>>>>
>>>> David
--
Prof. Dr. David Pedrosa
Senior Consultant, Department of Neurology,
Head of the Movement Disorders and Neuromodulation Section,
University Hospital of Gießen and Marburg
Tel. (+49) 6421-58 65299 Fax. (+49) 6421-58 67055
Address. Baldingerstr., 35043 Marburg
Web. https://www.ukgm.de/ugm_2/deu/umr_neu/index.html
Web. https://www.uni-marburg.de/de/fb20/bereiche/kopfz/neurologie/forschung/agbun