[R-meta] extracting variances

Viechtbauer, Wolfgang (SP) wolfgang.viechtbauer at maastrichtuniversity.nl
Tue Aug 18 12:08:14 CEST 2020


Dear Gil,

1) One can indeed compute derived variances based on the H (and G) matrix in this way. 

Whether this reflects response variance over time, I am not so sure. I started to think this through, but it was taking too much time, so I will skip this one.

2) Not sure what you mean by 'residual variance'. In meta-analytic models, each estimate has its own sampling variance, and those could be called the residual variances. They are not shown in the output, but dat$vi will show them; note there isn't just one residual variance, since each estimate has its own. Or do you mean the variance in the underlying true outcomes at the estimate level? That would be sigma^2.2 in your model.
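For concreteness, a quick sketch of where those quantities live (assuming 'dat' from your escalc() call and your fitted rma.mv object 'mod'):

```r
# Sketch: per-estimate sampling variances and the estimate-level component
# (assumes 'dat' from escalc() and a fitted rma.mv object 'mod' exist)
head(dat$vi)    # each estimate's own sampling variance ("residual variance")
mod$sigma2      # estimated sigma^2.1 (ref) and sigma^2.2 (idRow)
mod$sigma2[2]   # variance of the underlying true outcomes at the estimate level
```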

Best,
Wolfgang

>-----Original Message-----
>From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-project.org]
>On Behalf Of Gram, Gil (IITA)
>Sent: Tuesday, 18 August, 2020 10:59
>To: r-sig-meta-analysis using r-project.org
>Subject: Re: [R-meta] extracting variances
>
>Dear Wolfgang,
>
>As always, thanks for your reply.
>
>Even in effect-size units, the response variance 0.00775 is extremely small.
>Otherwise, two last questions:
>
>  1.  does the approach seem correct to you? (extracting the vcov matrix H
>in order to compute the Control, MR, OR, and ORMR variance responses over
>time)
>  2.  where can I find the residual variance in the output below?
>
>Multivariate Meta-Analysis Model (k = 2943; method: REML)
>
>   logLik   Deviance        AIC        BIC       AICc
>-342.3623   684.7246   752.7246   956.0338   753.5493
>
>Variance Components:
>
>            estim    sqrt  nlvls  fixed  factor
>sigma^2.1  0.0924  0.3040     40     no     ref
>sigma^2.2  0.0251  0.1583   2943     no   idRow
>
>outer factor: idSite    (nlvls = 71)
>inner factor: treatment (nlvls = 4)
>
>            estim    sqrt  k.lvl  fixed    level
>tau^2.1    0.1865  0.4319    275     no  Control
>tau^2.2    0.0922  0.3036    374     no       MR
>tau^2.3    0.1035  0.3217   1039     no       OR
>tau^2.4    0.0640  0.2530   1255     no     ORMR
>rho        0.8154                    no
>
>outer factor: idSite.time (nlvls = 271)
>inner factor: treatment   (nlvls = 4)
>
>              estim    sqrt  k.lvl  fixed    level
>gamma^2.1    0.1072  0.3274    275     no  Control
>gamma^2.2    0.1429  0.3780    374     no       MR
>gamma^2.3    0.1129  0.3360   1039     no       OR
>gamma^2.4    0.1405  0.3748   1255     no     ORMR
>phi          0.9435                    no
>
>Test for Residual Heterogeneity:
>QE(df = 2921) = 1068664.4274, p-val < .0001
>
>Test of Moderators (coefficients 2:22):
>QM(df = 21) = 1259.6581, p-val < .0001
>
>Thanks and have a good day!
>Gil
>
>Message: 1
>Date: Thu, 6 Aug 2020 12:07:57 +0000
>From: "Viechtbauer, Wolfgang (SP)"
><wolfgang.viechtbauer using maastrichtuniversity.nl>
>To: "Gram, Gil (IITA)" <G.Gram using cgiar.org>,
><r-sig-meta-analysis using r-project.org>
>Subject: Re: [R-meta] extracting variances
>Message-ID: <da3972b28cf14339b934c848f40a7c30 using UM-MAIL3214.unimaas.nl>
>Content-Type: text/plain; charset="utf-8"
>
>Dear Gil,
>
>You seem to interpret 0.00775 as 0.77% but the variances (or contrasts
>thereof) are not percentages. They are variances (in the units of whatever
>effect size / outcome measure you are using).
>
>Best,
>Wolfgang
>
>-----Original Message-----
>From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-project.org]
>On Behalf Of Gram, Gil (IITA)
>Sent: Tuesday, 14 July, 2020 11:38
>To: r-sig-meta-analysis using r-project.org
>Subject: [R-meta] extracting variances
>
>Dear all,
>
>I have a question regarding extracting the variances from my model.
>
>Say I want to analyse the yields (tonnes per hectare) of 4 treatments
>(control, OR, MR, ORMR) running across different sites and times. A
>simplified version of my model would then be:
>
>dat = escalc(measure="MN", mi=yield, sdi=sdYield, ni=nRep, data=temp)
>dat$vi = dat$vi/(4*dat$yi) # delta-method variance for the sqrt transformation (must use the untransformed yi, so adjust vi first)
>dat$yi = sqrt(dat$yi) # sqrt transformation
>
>mod = rma.mv(yi, as.matrix(vi), method="REML", struct="HCS", sparse=TRUE,
>             data=dat,
>             mods = ~ rateORone + kgMN + I(rateORone^2) + I(kgMN^2) +
>                 rateORone:kgMN + I(rateORone^2):I(kgMN^2) + […],
>             random = list(~ 1 | ref, ~ 1 | idRow, ~ treatment | idSite,
>                           ~ treatment | idSite.time))
>
>
>I’m interested in the yield variance responses over time, of OR and ORMR
>versus control. So I extract the variance-covariance matrix H = mod$H:
>
>        Control        MR        OR      ORMR
>Control 0.1098190 0.1179042 0.1055471 0.1216751
>MR      0.1179042 0.1360579 0.1174815 0.1354332
>OR      0.1055471 0.1174815 0.1090329 0.1212389
>ORMR    0.1216751 0.1354332 0.1212389 0.1449001
>
>The variance responses I then calculate with e.g. responseOR = varianceOR +
>varianceControl - 2*covar(OR, Control):
>
>resOR
>= (H['OR','OR'] + H['Control','Control'] - 2*H['Control','OR'])
>= 0.1090329 + 0.1098190 - 2* 0.1055471
>~ 0.00775
>
>resORMR
>~ 0.0114
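The arithmetic can be checked directly from the H matrix, using the identity Var(X - Y) = Var(X) + Var(Y) - 2*Cov(X, Y); a self-contained sketch with the values copied from this post:

```r
# Variance of a treatment-vs-control difference from the vcov matrix H
# (values below are the gamma-level matrix mod$H copied from this post)
trts <- c("Control", "MR", "OR", "ORMR")
H <- matrix(c(0.1098190, 0.1179042, 0.1055471, 0.1216751,
              0.1179042, 0.1360579, 0.1174815, 0.1354332,
              0.1055471, 0.1174815, 0.1090329, 0.1212389,
              0.1216751, 0.1354332, 0.1212389, 0.1449001),
            nrow = 4, byrow = TRUE, dimnames = list(trts, trts))
contrast_var <- function(H, a, b) H[a, a] + H[b, b] - 2 * H[a, b]
round(contrast_var(H, "OR",   "Control"), 5)  # 0.00776
round(contrast_var(H, "ORMR", "Control"), 5)  # 0.01137
```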
>
>I understand therefore that the variance responses over time for treatments
>OR and ORMR are about 0.77% and 1.1%. These values are extremely small,
>hence my questions:
>
>- Am I correct that this was the correct way to estimate the yield
>variability (responses) over time?
>
>If this is all correct, then there is hardly any variability associated
>with these components, and one could start wondering what the point is of
>even looking at this. I looked at the values of the other components to see
>whether these are larger.
>
>- Keeping in mind the original data was sqrt transformed, can these values
>still be considered as variances? or as standard deviations instead?
>- If this makes up so little variance, then where is the variance coming
>from? How much variability is associated with the error term? Or the other
>components. Are these then magnitudes larger? How do I check if the sum of
>all variance components equals 100% with the model output below?
>
>I hope my questions are clear…
>
>Thanks a lot in advance for your help,
>
>Gil
>
>------
>
>Multivariate Meta-Analysis Model (k = 1161; method: REML)
>
>Variance Components:
>
>            estim    sqrt  nlvls  fixed  factor
>sigma^2.1  0.0604  0.2458     40     no     ref
>sigma^2.2  0.0285  0.1688   1161     no   idRow
>
>outer factor: idSite    (nlvls = 71)
>inner factor: treatment (nlvls = 4)
>
>            estim    sqrt  k.lvl  fixed    level
>tau^2.1    0.1285  0.3584    275     no  Control
>tau^2.2    0.0952  0.3086    374     no       MR
>tau^2.3    0.1217  0.3488    234     no       OR
>tau^2.4    0.0711  0.2666    278     no     ORMR
>rho        0.7172                    no
>
>outer factor: idSite.time (nlvls = 271)
>inner factor: treatment   (nlvls = 4)
>
>            estim    sqrt  k.lvl  fixed    level
>gamma^2.1    0.1098  0.3314    275     no  Control
>gamma^2.2    0.1361  0.3689    374     no       MR
>gamma^2.3    0.1090  0.3302    234     no       OR
>gamma^2.4    0.1449  0.3807    278     no     ORMR
>phi          0.9646                    no
>
>Test for Residual Heterogeneity:
>QE(df = 1151) = 501266.0717, p-val < .0001
>
>Test of Moderators (coefficients 2:10):
>QM(df = 9) = 441.0373, p-val < .0001
>
>Model Results:
>
>                          estimate      se     zval    pval    ci.lb    ci.ub
>intrcpt                     1.2855  0.0691  18.6010  <.0001   1.1501   1.4210  ***
>rateORone                   0.0059  0.0007   8.5224  <.0001   0.0045   0.0072  ***
>kgMN                        0.0096  0.0009  10.5108  <.0001   0.0078   0.0114  ***
>I(rateORone^2)             -0.0000  0.0000  -5.2103  <.0001  -0.0000  -0.0000  ***
>I(kgMN^2)                  -0.0000  0.0000  -6.6753  <.0001  -0.0000  -0.0000  ***
>[…]
>rateORone:kgMN             -0.0000  0.0000  -3.7035  0.0002  -0.0000  -0.0000  ***
>I(rateORone^2):I(kgMN^2)    0.0000  0.0000   2.5775  0.0100   0.0000   0.0000   **
>
>
>------------------------------
>
>Message: 2
>Date: Thu, 6 Aug 2020 12:22:49 +0000
>From: "Viechtbauer, Wolfgang (SP)"
><wolfgang.viechtbauer using maastrichtuniversity.nl>
>To: Thao Tran <thaobrawn using gmail.com>,
><r-sig-meta-analysis using r-project.org>
>Subject: Re: [R-meta] Correction for sample overlap in a meta-analysis
>of prevalence
>Message-ID: <4e330b7a5b1a4fddb65030b4f938638c using UM-MAIL3214.unimaas.nl>
>Content-Type: text/plain; charset="utf-8"
>
>Dear Thao,
>
>I do not know these papers, so I cannot comment on what methods they
>describe and whether those could be implemented using metafor.
>
>Obviously, the degree of dependence between overlapping estimates depends on
>the degree of overlap. Say there are two diseases (as in your example). Then
>if we had the raw data, we could count the number of individuals that:
>
>x1:  have only disease 1
>x2:  have only disease 2
>x12: have both disease 1 and 2
>x0:  have neither disease
>
>Let n = x1 + x2 + x12 + x0. Then you have p1 = (x1+x12) / n and p2 =
>(x2+x12) / n as the two prevalences. One could easily work out the
>covariance (I am too lazy to do that right now), but in the end this won't
>help, because computing this will require knowing all the x's, not just p1
>and p2 and n. And I assume no information is reported on the degree of
>overlap. One could maybe make some reasonable 'guestimates' and then compute
>the covariances followed by a sensitivity analysis.
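One can in fact work the covariance out under a multinomial model for the four counts: Cov(p1_hat, p2_hat) = (p12 - p1*p2)/n, with p12 = x12/n (a derivation not in the original post). A sketch of the guestimate-plus-sensitivity idea, with made-up counts:

```r
# Covariance of two overlapping prevalence estimates under a multinomial
# model for the counts (x1, x2, x12, x0); x12 is the unreported overlap
# that would have to be guestimated and varied in a sensitivity analysis.
overlap_cov <- function(x1, x2, x12, x0) {
  n   <- x1 + x2 + x12 + x0
  p1  <- (x1 + x12) / n   # prevalence of disease 1
  p2  <- (x2 + x12) / n   # prevalence of disease 2
  p12 <- x12 / n          # proportion with both diseases
  (p12 - p1 * p2) / n
}
overlap_cov(x1 = 50, x2 = 40, x12 = 20, x0 = 90)  # made-up counts
```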
>
>Alternatively, you could use the 'sandwich' method (cluster-robust
>inference). This has been discussed on this mailing list extensively in the
>past (not in the context of overlap in such estimates, but the principle is
>all the same).
>
>Best,
>Wolfgang
>
>-----Original Message-----
>From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-project.org]
>On Behalf Of Thao Tran
>Sent: Tuesday, 04 August, 2020 15:26
>To: r-sig-meta-analysis using r-project.org
>Subject: [R-meta] Correction for sample overlap in a meta-analysis of
>prevalence
>
>Dear all,
>
>I want to conduct a meta-analysis of around 30 studies (from a systematic
>review).
>
>Some background on the studies: the quantity of interest is the prevalence
>of RSV infection, and different studies reported RSV prevalence for
>different risk groups. It is quite common for people to suffer from
>multiple comorbidities (for example, an individual might have both cardiac
>disease and lung disease), and the reported data did not state clearly
>whether two such sub-populations (cardiac disease patients and lung disease
>patients) are mutually exclusive. In the end, I want an overall estimate
>across all risk groups. Given the above, it is likely that some of the data
>(from two or more risk groups) share a proportion of the population. For
>example, John's study reported data on cardiac disease as well as lung
>disease, and both risk groups were included in the meta-analysis; we need
>to take into account that the two sub-populations might share some
>participants.
>
>I searched the internet for methods to account for overlapping samples
>while conducting a meta-analysis. Two papers address this problem:
>
> 1. https://academic.oup.com/bioinformatics/article/33/24/3947/3980249
>    The authors proposed FOLD, a method to optimize power in a
>    meta-analysis of genetic association studies with overlapping subjects.
> 2. http://www.stiftung.at/wp-content/uploads/2015/04/BomPaper_Oct_2014.pdf
>    In this paper, the author compared generalized weights and
>    inverse-variance weights meta-estimates to account for sample overlap.
>
>My question is:
>
>Are these approaches incorporated into the *metafor* package?
>Thanks for your input.
>Best,
>
>Thao
>--
>*Trần Mai Phương Thảo*
>Master Student - Master of Statistics
>Hasselt University - Belgium.
>Email: Thaobrawn using gmail.com / maiphuongthao.tran using student.uhasselt.be
>Phone number: +84 979 397 410 / 0032 488 035843
>
>
>------------------------------
>
>Message: 3
>Date: Thu, 6 Aug 2020 12:29:48 +0000
>From: "Viechtbauer, Wolfgang (SP)"
><wolfgang.viechtbauer using maastrichtuniversity.nl>
>To: Tzlil Shushan <tzlil21092 using gmail.com>,
><r-sig-meta-analysis using r-project.org>
>Subject: Re: [R-meta] Performing a multilevel meta-analysis
>Message-ID: <024f9bc096534c129decb63836a59f1f using UM-MAIL3214.unimaas.nl>
>Content-Type: text/plain; charset="utf-8"
>
>Dear Tzlil,
>
>Unless you have good reasons to do so, do not use custom weights. rma.mv()
>uses weights and the default ones are usually fine.
>
>weights(res, type="rowsum") will only (currently) work in the 'devel'
>version of metafor, which you can install as described here:
>
>https://wviechtb.github.io/metafor/#installation
>
>I can't really comment on the second question, because answering this would
>require knowing all details of what is being computed/reported.
>
>As for the last question ("is there a straightforward way in metafor to
>specify the analysis with Chi-square values"): No, chi-square values are
>test statistics, not an effect size / outcome measure, so they cannot be
>used for a meta-analysis (at least not with metafor).
>
>Best,
>Wolfgang
>
>-----Original Message-----
>From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-project.org]
>On Behalf Of Tzlil Shushan
>Sent: Wednesday, 05 August, 2020 5:45
>To: r-sig-meta-analysis using r-project.org
>Subject: [R-meta] Performing a multilevel meta-analysis
>
>Hi R legends!
>
>My name is Tzlil and I'm a PhD candidate in Sport Science - Human
>performance science and sports analytics
>
>I'm currently working on a multilevel meta-analysis using the metafor
>package.
>
>My first question is around the methods used to assign weights within rma.mv
>models.
>
>I'd like to know if there is a conventional or 'most conservative' approach
>to continue with. Since I haven't found a consistent methodology within the
>multilevel meta-analyses papers I read, I originally applied a weight which
>pertains to variance (vi) and number of effect sizes from the same study. I
>found this method in a lecture by Joshua R. Polanin
>https://www.youtube.com/watch?v=rJjeRRf23L8&t=1719s from 28:00.
>
>W = 1/vi, then divided by the number of ES for a study
>for example, a study with vi = 0.0402 and 2 different ES will weight as
>follow;
>1/0.0402 = 24.88, then 24.88/2 = 12.44 (finally, converting into
>percentages based on the overall weights in the analysis)
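The computation described above, as a small sketch in R:

```r
# Inverse-variance weight split equally across a study's effect sizes
# (the scheme from the Polanin lecture described above)
vi   <- 0.0402   # sampling variance of the study's effect sizes
n_es <- 2        # number of effect sizes contributed by the study
w    <- 1 / vi   # inverse-variance weight, about 24.88
w / n_es         # about 12.44 per effect size, then rescaled to percentages
```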
>
>After I've read some of the great posts provided in last threads here such
>as;
>http://www.metafor-project.org/doku.php/tips:weights_in_rma.mv_models and
>https://www.jepusto.com/weighting-in-multivariate-meta-analysis/
>I wonder whether my approach is incorrect and whether I need to modify the
>way I use weights in my model.
>
>I tried to imitate the approach used in the first link above. However, I
>get an error every time I specify weights(res, type="rowsum"): *Error in
>match.arg(type, c("diagonal", "matrix")) : 'arg' should be one of
>“diagonal”, “matrix”*
>
>My second question is related to the way I meta-analyse a specific ES. My
>meta-analysis involves the reliability and convergent validity of heart
>rate during a specific task, which is measured in relative values (i.e.
>percentages). Therefore, my meta-analysis includes four different ES
>parameters (mean difference, MD; intraclass correlation, ICC; standard
>error of measurement, SEM; and correlation coefficient, r).
>
>I wonder how I need to use SEM before starting the analysis. I've seen some
>papers which squared and log transformed the SEM before performing a
>meta-analysis, while others converted the SEM into CV%. Due to the original
>scale of our ES (which is already in percentages) I'd like to perform the
>analysis without converting it into CV% values. Should I use the SEM as
>reported, or log-transform it first? Further, is there a straightforward
>way in metafor to specify the analysis with chi-square values (as with
>"ZCOR" for correlations)?
>
>Thanks in advance!
>
>Kind regards,
>
>Tzlil Shushan | Sport Scientist, Physical Preparation Coach
>
>BEd Physical Education and Exercise Science
>MSc Exercise Science - High Performance Sports: Strength &
>Conditioning, CSCS
>PhD Candidate Human Performance Science & Sports Analytics
>
>
>------------------------------
>
>Subject: Digest Footer
>
>_______________________________________________
>R-sig-meta-analysis mailing list
>R-sig-meta-analysis using r-project.org
>https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
>
>
>------------------------------
>
>End of R-sig-meta-analysis Digest, Vol 39, Issue 2
>**************************************************


More information about the R-sig-meta-analysis mailing list