[R-meta] Inverse weighting after estimation of VCOV

James Pustejovsky jepu@to @end|ng |rom gm@||@com
Fri May 31 15:22:03 CEST 2024


Hi David,

Thanks for clarifying your data structure. Based on what you've described,
I don't think it makes sense to use vcalc(). The point of vcalc() is to
build in covariance between the sampling errors of the effect size
estimates. For your one publication that reports 8 studies, each effect
size estimate is based on a separate sample of participants (because each
estimate comes from a different country). So there's no reason to expect
that there would be covariance in the sampling errors.

Instead, one might suspect that there would be covariance between the
country-specific effect size parameters (i.e., the "true" effect sizes)
from this publication. This would be plausible if the same operational
procedures (e.g., same recruitment approach, same measurement
instrumentation, same follow-up window) were used across the samples in
this publication. The conventional way to model this would be to 1) specify
effect size estimates as independent but 2) include publication-level
random effects in the model to capture shared operational variance within
publications. The syntax would be something like:
res_metaRE <- rma.mv(
  yi, V = vi,
  random = ~ 1 | publicationID / number,
  mods = ~ hospitalbeds + ltcbeds,
  verbose = TRUE,
  data = df_complete, sparse = TRUE
)
You'll need to create a publicationID variable if you don't already have
one in the data.
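
A minimal sketch of one way to build that variable, assuming the author
labels follow the "Publication, Country" pattern shown in the posted data.
The name publicationID and the comma-splitting rule are illustrative
assumptions, not something taken from the original analysis:

```r
# Author labels as posted in the thread (spelling kept as in the data)
author <- c("Aamodt", "Ceylan", "Krause", "Kumar",
            "Moens, Belgium", "Moens, France", "Moens, Italy",
            "Moens, Canada", "Moens, Mexiko", "Moens, New Zeeland",
            "Moens, Spain", "Moens, South Corea", "Moens, Czech Rep.",
            "Moens, Hungary", "Moens, USA")

# Stand-in for the real data frame; only the columns needed here
df_complete <- data.frame(number = seq_along(author), author = author)

# Drop everything from the first comma onward, so that all
# country-specific samples from one publication share a single ID
df_complete$publicationID <- sub(",.*$", "", df_complete$author)

# Number of samples per publication
table(df_complete$publicationID)
```

If the author labels do not follow this pattern in the full data, the IDs
would have to be assigned by hand instead.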

The difficulty with this approach in your case is that there's only one
publication that has multiple samples nested within it, so there's not a
lot of information available to parse out the variance at the publication
level from the variance at the sample level (across countries). You could
try using the model fit statistics to compare the model above versus a
model that only has random effects at the sample level.
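
As a rough sketch of that comparison, assuming the metafor package is
installed: the yi and vi values below are copied from the posted data and
rounded, the publicationID coding is an assumption, and the moderators are
left out for brevity, so this is not the full model from above.

```r
library(metafor)

# yi and vi from the posted data (rounded); publicationID is assumed:
# studies 1-4 are separate publications, studies 5-15 share one
dat <- data.frame(
  number = 1:15,
  publicationID = c(1, 2, 3, 4, rep(5, 11)),
  yi = c(0.0583, 0.3577, 2.1102, 0.3743, 0.8329, 0.4055, 0.3365, 0.5306,
         -0.0513, 0.6780, 0.0862, -0.0513, -0.0305, 0.3646, 0.3365),
  vi = c(0.000836, 0.638633, 0.604067, 0.000436, 0.046721, 0.004683,
         0.005386, 0.013295, 0.021412, 0.142113, 0.048944, 0.013869,
         0.186242, 0.070216, 0.001333)
)

# Random effects at the sample level only
fit_sample <- rma.mv(yi, V = vi, random = ~ 1 | number, data = dat)

# Samples nested within publications
fit_nested <- rma.mv(yi, V = vi, random = ~ 1 | publicationID / number,
                     data = dat)

# Both fits use REML with the same fixed effects, so the likelihood
# ratio test and the information criteria are comparable
anova(fit_nested, fit_sample)
fitstats(fit_sample, fit_nested)
```

With only one multi-sample publication, the publication-level variance
component may well be estimated at or near zero, in which case the two
models would fit about equally well.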

James

On Mon, May 27, 2024 at 8:54 AM David Pedrosa via R-sig-meta-analysis <
r-sig-meta-analysis using r-project.org> wrote:

> Hi James,
>
> Apologies, my question was not thought through well enough.
>
> I have a data frame with 16 studies, all of which provide odds
> ratios for hospitalisation. 8 studies are from the same publication but
> cover different countries. To me there is still reason to believe they
> “share more variance” than the rest. Besides, I want to weight the
> studies by their total number of subjects. To make it a bit more
> complex, we have dug out the number of hospital beds and long-term care
> beds for every country, both of which we consider potential moderators.
> I ran the random effects model
>
> res_metaRE <- rma(yi, vi,
>   random = ~ 1 | number,
>   mods = ~ hospitalbeds + ltcbeds,
>   verbose=TRUE, data=df_complete)
>
> for which weights(res_metaRE) gives plausible results. If I try to
> estimate the VCOV matrix, the result has the correct diagonal values, that
> is, identical to df_complete$vi. But passing the resulting V_mat
>
> V_mat <- vcalc(vi=vi, cluster=shared_variance, data=df_complete, rho=.7)
>
> to rma.mv gives estimates that are too high, and in particular the studies
> with fewer subjects receive more weight. I am assuming that
> the weighting is somehow inverted, but I cannot tell whether I’m missing
> something or whether there is some other mistake in the way I’m estimating
> the VCOV. Number is just the study id.
>
> I’m not entirely sure I understand your point with the subsection of the
> matrix.
>
> Thanks for your help!
> Best,
> David
>
> P.S.: Here are the relevant parts of df_complete
>
> structure(list(number = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
> 12, 13, 14, 15), author = c("Aamodt", "Ceylan", "Krause", "Kumar",
> "Moens, Belgium", "Moens, France", "Moens, Italy", "Moens, Canada",
> "Moens, Mexiko", "Moens, New Zeeland", "Moens, Spain",
> "Moens, South Corea",
> "Moens, Czech Rep.", "Moens, Hungary", "Moens, USA"), year = c(2023,
> 2022, 2021, 2021, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015,
> 2015, 2015, 2015), n_ges = c(53279, 27, 40, 346141, 837, 4599,
> 4034, 1381, 1062, 202, 352, 1565, 92, 241, 20065), OR = c(1.06,
> 1.43, 8.25, 1.454, 2.3, 1.5, 1.4, 1.7, 0.95, 1.97, 1.09, 0.95,
> 0.97, 1.44, 1.4), hospitalbeds = c(2.77, 3.02, 7.76, 2.77, 5.47,
> 5.65, 3.12, 2.58, 1, 2.57, 2.96, 12.77, 6.66, 6.79, 2.77), ltcbeds =
> c(32.3,
> 9.5, 54.2, 53.9, 66.8, 47.4, 21.3, 46.7, 0, 50.4, 43.4, 25, 34.9,
> 42.6, 28.9), p_values = c(0.106809128205467, 0.706331045003814,
> 0.0281267337718951, 0, 2.43772276381116e-05, 2.76746355676653e-22,
> 1.01260208850919e-05, 1.19251123951374e-10, 0.772759462747246,
> 0.0741077696800058, 0.74088983860122, 0.68164335922065, 1,
> 0.183303852299051,
> 3.20176730771634e-26), shared_variance = c(0, 0, 0, 0, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 1), yi = structure(c(0.0582689081239758,
> 0.357674444271816, 2.11021320034659, 0.374318379111328, 0.832909122935104,
> 0.405465108108164, 0.336472236621213, 0.53062825106217,
> -0.0512932943875506,
> 0.678033542749897, 0.0861776962410524, -0.0512932943875506,
> -0.0304592074847086,
> 0.364643113587909, 0.336472236621213), ni = c(53279, 27, 40,
> 346141, 837, 4599, 4034, 1381, 1062, 202, 352, 1565, 92, 241,
> 20065), measure = "GEN"), vi = c(0.000835840725678602, 0.638632983584221,
> 0.604067037193667, 0.000435509388232691, 0.0467214213223696,
> 0.00468347897652763, 0.00538603813506437, 0.0132951153208062,
> 0.0214123920152818, 0.142112789690683, 0.0489441998392354,
> 0.0138688993962097,
> 0.186242249276727, 0.0702159732616764, 0.00133268716433697)), row.names
> = c(NA,
> -15L), class = c("escalc", "data.frame"), yi.names = "yi", vi.names =
> "vi", digits = c(est = 4,
> se = 4, test = 4, pval = 4, ci = 4, var = 4, sevar = 4, fit = 4,
> het = 4))
>
> On 24.05.2024 at 19:06, James Pustejovsky wrote:
> > Hi David,
> >
> > I don't entirely understand the models that you're looking at, so
> > clarifying the following would help in getting good feedback:
> > * What is the variable `shared_variance` used in the vcalc call?
> > * What is the variable `number` used in the random effects argument of
> > rma.mv?
> > * How are these variables related?
> >
> > Additionally, it would be good to check that the vcov matrix created
> > by vcalc() is as you intend it to be. Could you pull out the blocks of
> > this matrix for a few studies and just verify that they give you
> > covariance matrices with a correlation of 0.7? I mean something like:
> > vcov_study_k <- V_mat[i:j, i:j]
> > cov2cor(vcov_study_k)
> > where the indices i:j are the rows in your data corresponding to a
> > given study k.
> >
> > James
> >
> > On Fri, May 24, 2024 at 10:00 AM David Pedrosa via R-sig-meta-analysis
> > <r-sig-meta-analysis using r-project.org> wrote:
> >
> >     Dear all,
> >
> >     I have a basic question about the output of my (gu)estimation of the
> >     variance-covariance matrix. I have extracted results from very
> >     heterogeneous studies with OR as effect size (sample sizes between 20
> >     and 300,000). Since some of the results come from the same study, I
> >     decided to try to use the VCOV as an input and estimated values
> >     according to the following formula
> >
> >     V_mat <- vcalc(vi=vi, cluster=shared_variance, data=df_complete,
> >                    rho=.7)
> >     res_meta <- rma.mv(yi, vi, V=V_mat,
> >                        random = ~ 1 | number,
> >                        mods = ~ hospitalbeds + ltcbeds,
> >                        verbose=TRUE, data=df_complete)
> >
> >
> >     Interestingly, in this case the weighting is reversed, so that
> >     most of the weight is given to studies with the smallest sample
> >     size; something that does not happen when using this formula:
> >
> >     res_meta <- rma(yi, vi,
> >                     random = ~ 1 | number,
> >                     mods = ~ hospitalbeds + ltcbeds,
> >                     verbose=TRUE, data=df_complete)
> >
> >     I have tried to understand what is going on, but I am kind of
> >     lost.
> >     Could someone please give me some advice?
> >
> >     Thanks in advance,
> >
> >     David
> >
> >     _______________________________________________
> >     R-sig-meta-analysis mailing list @ R-sig-meta-analysis using r-project.org
> >     To manage your subscription to this mailing list, go to:
> >     https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
> >
> --
> Prof. Dr. David Pedrosa
> Senior Consultant, Department of Neurology,
> Head of the Section for Movement Disorders and Neuromodulation,
> Universitätsklinikum Gießen und Marburg
> Tel. (+49) 6421-58 65299 Fax. (+49) 6421-58 67055
> Address. Baldingerstr., 35043 Marburg
> Web. https://www.ukgm.de/ugm_2/deu/umr_neu/index.html
> Web. https://www.uni-marburg.de/de/fb20/bereiche/kopfz/neurologie/forschung/agbun
>
