[R-meta] metafor package in R - Risk ratios using rma.mv()
Olivia Cords
ocord@ @end|ng |rom @|umn|@@t@n|ord@edu
Thu Aug 8 23:44:51 CEST 2019
Dear Michael,
Thanks so much for directing me to that resource. This makes a lot of sense
and Q1 and Q4 have been resolved.
Best,
Olivia
On Thu, Aug 8, 2019 at 12:58 AM Michael Dewey <lists using dewey.myzen.co.uk>
wrote:
> Dear Olivia
>
> I cannot answer all your questions but I think the first section can be
> addressed by studying
>
> http://www.metafor-project.org/doku.php/tips:comp_two_independent_estimates
>
> You need to read a fair way in to get the answer but I think the issue
> is that the separate models allow for different tau^2 whereas the common
> model has a single estimate. The link shows you how to alter that.
>
> I will leave the rest for others to have a go at.
>
> Michael
>
> On 07/08/2019 23:08, Olivia Cords wrote:
> > Dear Wolfgang,
> > Thanks so much again for your help. The comments you provided have been
> > incredibly useful and I have implemented your suggestions. I just wanted
> to
> > follow up with you on this topic with a few additional clarifications
> > regarding incident rates, incident rate ratios, confidence intervals, and
> > similar issues for prevalence/odds ratios.
> >
> > *Incidence Rates*
> > I calculated the incidence rates (measure = "IR", ti =
> > data$person_years/1000) for tuberculosis stratified by World Health
> > Organization Region and got the following results from the model.
> >
> > m0 <- rma.mv(yi, vi, method='REML', mods = formula,
> > random= ~ 1 | study_id/cohort_id,
> > tdist=TRUE,
> > data=pd_ec)
> >
> > estimate se tval
> pval
> > intrcpt 3.5210 0.9728 3.6195
> 0.0003
> > who_region.1African Region 14.5115 2.6764 5.4219
> <.0001
> > who_region.1Eastern Mediterranean Region -0.9006 3.8084 -0.2365
> 0.8132
> > who_region.1European Region 5.2299 2.2226 2.3531
> 0.0191
> > who_region.1South-East Asian Region 7.2904 3.7296 1.9547
> 0.0513
> > who_region.1Western Pacific Region 2.1339 3.3186 0.6430
> 0.5206
> > ci.lb ci.ub
> > intrcpt 1.6083 5.4337 ***
> > who_region.1African Region 9.2492 19.7738 ***
> > who_region.1Eastern Mediterranean Region -8.3885 6.5873
> > who_region.1European Region 0.8600 9.5998 *
> > who_region.1South-East Asian Region -0.0426 14.6235 .
> > who_region.1Western Pacific Region -4.3910 8.6587
> >
> > To find the incidence rates, I added the slopes to the intercept to
> > produce the following Incident Rates/1000 person years:
> > Intercpt (Americas) 3.521
> > African Region 18.03249
> > Eastern Mediterranean Region 2.620433
> > European Region 8.750917
> > South-East Asian Region 10.81145
> > Western Pacific Region 5.654908
> >
> > To confirm these results I also ran the model on the dataset subsetted by
> > region, using the 'estimate' as each region's incidence rate. This
> resulted
> > in the following:
> >
> > Region of the Americas
> > 3.244745079
> > African Region
> > 19.06.383209
> > Eastern Mediterranean Region 2.491278082
> > European Region 7.675655135
> > South-East Asian Region
> > 11.48738471
> > Western Pacific Region 5.600276079
> >
> > *Q1) Why are the Incidence rates different when using the entire dataset
> > versus first subsetting by region? *
> >
> > *Rate Ratios:*
> >
> > Following your suggestion above for calculating rate ratios for each WHO
> > Region, I used the measure = "IRLN" and exponentiated the coefficients of
> > the model. I got the following model output:
> >
> > Multivariate Meta-Analysis Model (k = 390; method: REML)
> >
> > Variance Components:
> >
> > estim sqrt nlvls fixed factor
> > sigma^2.1 2.3898 1.5459 76 no study_id
> > sigma^2.2 0.5833 0.7637 390 no study_id/cohort_id
> >
> > Test for Residual Heterogeneity:
> > QE(df = 384) = 118469.9261, p-val < .0001
> >
> > Test of Moderators (coefficients 2:6):
> > F(df1 = 5, df2 = 384) = 7.3370, p-val < .0001
> >
> > Model Results:
> >
> > estimate se tval
> pval
> > intrcpt -7.1326 0.2449 -29.1259
> <.0001
> > who_region.1African Region 3.1798 0.6332 5.0217
> <.0001
> > who_region.1Eastern Mediterranean Region 0.9541 0.9634 0.9904
> 0.3226
> > who_region.1European Region 1.8665 0.5612 3.3261
> 0.0010
> > who_region.1South-East Asian Region 2.6693 0.9360 2.8518
> 0.0046
> > who_region.1Western Pacific Region 1.4697 0.8381 1.7536
> 0.0803
> > ci.lb ci.ub
> > intrcpt -7.6141 -6.6511 ***
> > who_region.1African Region 1.9348 4.4248 ***
> > who_region.1Eastern Mediterranean Region -0.9401 2.8483
> > who_region.1European Region 0.7631 2.9698 ***
> > who_region.1South-East Asian Region 0.8290 4.5096 **
> > who_region.1Western Pacific Region -0.1781 3.1176 .
> >
> >
> > And exponentiating the coefficients, I got these rate ratios:
> >
> > Intrcpt (Americas) 0.000799
> > African Region 24.04225
> > Eastern Mediterranean Region 2.596419
> > European Region 6.465383
> > South-East Asian Region 14.42971
> > Western Pacific Region 4.348128
> >
> > Based on the incidence rates produced by using the entire dataset in the
> > model and first subsetting by region, these Rate Ratios don't seem to be
> > correct. Simply dividing the incidence rates by a comparator (Region of
> the
> > Americas) to produce rate ratios would give the following:
> > Entire Dataset First Subsetted
> > Region of the Americas
> > African Region 5.121411531 5.875294245
> > Eastern Mediterranean Region 0.744106788 0.767788539
> > European Region 2.485089463 2.365564921
> > South-East Asian Region 3.070434536 3.540304225
> > Western Pacific Region 1.604657768 1.725952561
> >
> >
> > *Q2) Is exponentiating the coefficients (measure = IRLN) the way to
> > calculate rate ratios? Why are these results so different?*
> >
> > *Confidence Intervals*
> >
> > To calculate the 95% confidence intervals for the rate ratios, I first
> > calculated the standard deviation (SD[ln(IR)] = (1/A1 + 1/A2)^0.5, where
> A1
> > and A2 are the number of tuberculosis cases in each region), and then
> used
> > the following: 95% CI's = exp[ln(IR) ± 1.96(SD)]).
> >
> > *Q3) Is this the right approach to calculating the confidence intervals?
> Is
> > there a way to calculate the confidence intervals from the model (either
> > measure = "IRLN" or measure = "IR") output?*
> >
> > *Prevalence:*
> > Additionally, we have data for calculating tuberculosis prevalence rate
> by
> > WHO region and are trying to calculate odds ratios. For prevalence rate,
> > all of the code for the model is the same, except the escalc function is
> as
> > follows:
> > pd_ec <- escalc(measure = 'PR', xi =
> data_sub$prev_positive,ni
> > = data_sub$prev_total_n, append = TRUE,
> > data = data_sub),
> >
> > *Q4) For prevalence rates, should we first subset the data to find each
> > region's prevalence rate, or use the entire dataset (adding intercept to
> > each region's slope)?*
> > *Q5)How would we calculate odds ratios using the model?*
> > *Q6)How would we calculate the confidence intervals for these odds ratios
> > using the model? *
> >
> > Thank you so much again for your advice. It's really been indispensable
> and
> > is greatly appreciated.
> >
> > All the best,
> > Olivia
> >
> > On Fri, Jul 26, 2019 at 1:44 PM Viechtbauer, Wolfgang (SP) <
> > wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
> >
> >> Hi Olivia,
> >>
> >> A couple comments before I get to your actual question:
> >>
> >> 1) You might want to divide the person-years by some factor. For
> example:
> >>
> >> ti = data$person_years / 1000
> >>
> >> The incidence rates then reflect the number of cases per 1000 person
> >> years. That way, the rates aren't so small and the same goes for the
> model
> >> coefficients (e.g., the 0.0027 will turn into 2.7, which is a bit
> easier to
> >> communicate).
> >>
> >> 2) You don't need to write pd_ec$yi and pd_ec$vi when you use
> data=pd_ec.
> >> So, rma.mv(yi, vi, ..., data=pd_ec) is sufficient.
> >>
> >> 3) You should include random effects for studies (as you have done) and
> >> also random effects for estimates within studies. So:
> >>
> >> pd_ec$id <- 1:nrow(pd_ec)
> >>
> >> random = ~ 1 | study_id / id
> >>
> >> In fact, I see that you may have multiple estimates for the same cohort,
> >> so one could even use an additional level, that is:
> >>
> >> random = ~ 1 | study_id / cohort_id / id
> >>
> >> But if you only have 8 rows of data, then I wouldn't do that (so then I
> >> would use random = ~ 1 | study_id / id).
> >>
> >> 4) Now for your actual question: You are analyzing incidence rates, so
> the
> >> coefficients of your model reflect how the (average) rate changes as a
> >> function of those moderators (e.g., while the rate is 0.0027 for
> >> "1Americas", the rate is 0.0027-0.0025=0.0002 for "Africa"). In other
> >> words, the coefficients represent incidence rate differences. If you
> want
> >> ratios, you should analyze log incidence rates (measure="IRLN"). Then
> the
> >> coefficients reflect differences between log incidence rates, so if you
> >> exponentiate the coefficients of your model, you get incidence rate
> ratios.
> >>
> >> Best,
> >> Wolfgang
> >>
> >> -----Original Message-----
> >> From: R-sig-meta-analysis [mailto:
> >> r-sig-meta-analysis-bounces using r-project.org] On Behalf Of Olivia Cords
> >> Sent: Friday, 26 July, 2019 21:27
> >> To: r-sig-meta-analysis using r-project.org
> >> Subject: [R-meta] metafor package in R - Risk ratios using rma.mv()
> >>
> >> ATTACHMENT(S) REMOVED: metafor_output_0724 (1).png |
> 702_tb_incidence.xlsx
> >> | 702_metafor_tb_incidence (4).r
> >>
> >> Hello!
> >> My name is Olivia and I'm a researcher at Stanford University. Our group
> >> is trying to calculate relative risk ratios for the a meta-analysis
> >> extracting incidence rates of tuberculosis disease using the metafor
> >> package.
> >>
> >> Study Design:
> >> Incidence rates were extracted for each study identified by a systematic
> >> review, with some studies reporting multiple rates for different years
> or
> >> locations. In this case, multiple rates were treated as different study
> >> cohorts, meaning that the data is clustered ('cohort_id' nested within
> >> 'study_id').
> >>
> >> Model:
> >> We used the rma.mv() function, inputting the calculated incidence rate
> as
> >> the outcome variable ('pdc$yi), the variance ('pdc$vi'), WHO region
> >> ('who_region') and whether the study was conducted through passive or
> >> active screening (passive_active') as moderators, and a random effects
> >> argument for the study level ('study_id'). We are unclear how to go from
> >> the output to Risk Ratios.
> >>
> >> Data:
> >> study_id cohort_id n_diagnosed person_years who_region
> >> passive_active
> >> 131 34 77 14298 1Americas 1Passive
> >> 93 120 5 27750 1Americas 1Passive
> >> 93 121 14 277150 1Americas 1Passive
> >> 93 122 15 2000 1Americas 1Passive
> >> 136 383 2 2000 Africa 2Active
> >> 136 383 7 100000 Africa 2Active
> >> 187 16 16 100000 Africa 3Not
> >> Specified
> >> 187 517 2 100000 S.E. Asia 3Not
> >> Specified
> >>
> >> Code:
> >>
> >> library(xlsx)
> >> library(metafor)
> >>
> >> data <- read.xlsx('702_tb_incidence.xlsx', 1)
> >> print(data)
> >> #calculating the incidence rate and the variance
> >> pd_ec <- escalc(
> >> measure = 'IR',
> >> xi = data$n_diagnosed,
> >> ti = data$person_years,
> >> data = data
> >> )
> >> pd_ec
> >>
> >> #specifying mixed effects model
> >> #first level cohort incidence rate and variation
> >> #second level study_id
> >> #who_region and passive_active as moderators
> >> m0 <- rma.mv(pd_ec$yi, pd_ec$vi, method='REML', mods = ~ who_region +
> >> passive_active,
> >> random= ~ 1 | study_id,
> >> tdist=TRUE,
> >> data=pd_ec)
> >>
> >> summary(m0)
> >>
> >> Output:
> >> Multivariate Meta-Analysis Model (k = 8; method: REML)
> >>
> >> logLik Deviance AIC BIC AICc
> >> 15.4961 -30.9922 -20.9922 -24.0607 39.0078
> >>
> >> Variance Components:
> >>
> >> estim sqrt nlvls fixed factor
> >> sigma^2 0.0000 0.0037 4 no study_id
> >>
> >> Test for Residual Heterogeneity:
> >> QE(df = 4) = 94.4465, p-val < .0001
> >>
> >> Test of Moderators (coefficients 2:4):
> >> F(df1 = 3, df2 = 4) = 3.7931, p-val = 0.1152
> >>
> >> Model Results:
> >>
> >> estimate se tval pval ci.lb
> ci.ub
> >> intrcpt 0.0027 0.0027 1.0073 0.3708 -0.0047
> >> 0.0101
> >> who_regionAfrica -0.0025 0.0046 -0.5493 0.6120 -0.0153
> >> 0.0102
> >> who_regionS.E. Asia -0.0027 0.0046 -0.5797 0.5932 -0.0154
> >> 0.0101
> >> passive_active2Active -0.0001 0.0053 -0.0167 0.9874 -0.0148
> 0.0146
> >>
> >> Any advice/insight in much appreciated!
> >>
> >> Best wishes,
> >> Olivia
> >>
> >
> > [[alternative HTML version deleted]]
> >
> > _______________________________________________
> > R-sig-meta-analysis mailing list
> > R-sig-meta-analysis using r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
> >
> > ---
> > This email has been checked for viruses by AVG.
> > https://www.avg.com
> >
> >
>
> --
> Michael
> http://www.dewey.myzen.co.uk/home.html
>
[[alternative HTML version deleted]]
More information about the R-sig-meta-analysis
mailing list