[R-meta] Do the results of rma.mv() depend on how dataframe rows are ordered/arranged?

Thu Aug 22 15:58:40 CEST 2019

Gabriele,

The problem is that the variance matrix gets re-ordered when you do split()
%>% lapply() %>% bldiag(). Try running the following to see the issue:

# data frame, not sorted by ID
dat <- data.frame(ID = rep(LETTERS[1:3], 4), v = 1:12)
split(dat$v, dat$ID)
bldiag(lapply(split(dat$v, dat$ID), diag, nrow = 4))

To use the split() %>% lapply() %>% bldiag() technique and have the matrix
come out so that the entries of the V matrix correspond to the rows of the
data, the data needs to be sorted by the same ID variable that is used in
split().

James

On Thu, Aug 22, 2019 at 7:30 AM Gabriele Midolo <gabriele.midolo using gmail.com>
wrote:

> Dear Michael,
>
> That shouldn't be the issue, because I re-compute V after I sorted the
> rows and before I fitted the model (not showed in my previous email).
> Please find attached a minimal data-set and reproducible R script
> illustrating what is going on...
>
> Thank you very much,
> Gabri.
>
> PS: yi and vi_Bracken are the effect size and sampling variance of
> log-transformed response ratios obtained via escalc(), respectively.
>
> On Thu, 22 Aug 2019 at 13:07, Michael Dewey <lists using dewey.myzen.co.uk>
> wrote:
>
>> Dear Gabriele
>>
>> In the code you supply it appears you re-ordered the data but not the V
>> matrix. Is that the issue? If not can you post a reproducible example
>> with a minimal data-set and, ideally, using just plain R to manipulate
>> the data-set so we can see more clearly what the code does.
>>
>> Michael
>>
>> On 22/08/2019 10:29, Gabriele Midolo wrote:
>> > Dear all,
>> >
>> > I noticed that the results of my rma.mv() model strongly changes when
>> the
>> > dataframe rows are arranged in a different order. This is new to me and
>> I
>> > can’t really understand why. I thought it was an issue related to how
>> the V
>> > matrix get computed (?), but they are actually identical independently
>> from
>> > rows order in the dataframe… how is a dataframe supposed to be arranged
>> > before you can trust rma.mv() results then? Thanks.
>> >
>> >
>> > EXAMPLE:
>> >
>> > calc.v <- function(x) {
>> >
>> >    v <- matrix((x$sdC_imputed[1]^2 / (x$nC[1] * x$C[1]^2)) ,
>> nrow=nrow(x),
>> > ncol=nrow(x))
>> >
>> >    diag(v) <- x$vi_Bracken
>> >
>> >    v
>> >
>> > }
>> >
>> > random = ~ 1 | study/ID
>> >
>> >
>> > If I run the following:
>> >
>> >
>> >
>> > a<-df%>%filter(TRAIT=="LA")
>> >
>> > dat<-a%>%mutate(sdC_imputed=sdC)%>%ungroup()
>> >
>> > dat<-metagear::impute_SD(dat, "sdC_imputed", "C", method =
>> "Bracken1992")
>> >
>> > V <- metafor::bldiag(lapply(split(dat, dat$ctrl_id), calc.v))
>> >
>> > m<-rma.mv(yi~dTf,V,data=dat,random = random)
>> >
>> > m
>> >
>> >
>> >
>> > Multivariate Meta-Analysis Model (k = 34; method: REML)
>> >
>> >
>> >
>> > Variance Components:
>> >
>> >
>> >
>> >              estim    sqrt  nlvls  fixed    factor
>> >
>> > sigma^2.1  0.0000  0.0000      7     no     study
>> >
>> > sigma^2.2  0.0607  0.2464     34     no  study/ID
>> >
>> >
>> >
>> > Test for Residual Heterogeneity:
>> >
>> > QE(df = 32) = 1024.8720, p-val < .0001
>> >
>> >
>> >
>> > Test of Moderators (coefficient 2):
>> >
>> > QM(df = 1) = 1.6018, p-val = 0.2056
>> >
>> >
>> >
>> > Model Results:
>> >
>> >
>> >
>> >           estimate      se     zval    pval    ci.lb   ci.ub
>> >
>> > intrcpt   -0.0786  0.0538  -1.4617  0.1438  -0.1841  0.0268
>> >
>> > dTf       -0.0233  0.0184  -1.2656  0.2056  -0.0593  0.0128
>> >
>> >
>> >
>> > If I run the same code but arrange the dataframe by e.g. the ID of the
>> > observation via dplyr::arrange(),
>> >
>> > a<-df%>%filter(TRAIT=="LA")%>%arrange(ID)
>> >
>> >
>> > Then I get the following output… i.e. a completely different result:
>> >
>> > Multivariate Meta-Analysis Model (k = 34; method: REML)
>> >
>> >
>> >
>> > Variance Components:
>> >
>> >
>> >
>> >              estim    sqrt  nlvls  fixed    factor
>> >
>> > sigma^2.1  0.0000  0.0000      7     no     study
>> >
>> > sigma^2.2  0.0336  0.1834     34     no  study/ID
>> >
>> >
>> >
>> > Test for Residual Heterogeneity:
>> >
>> > QE(df = 32) = 334.3701, p-val < .0001
>> >
>> >
>> >
>> > Test of Moderators (coefficient 2):
>> >
>> > QM(df = 1) = 13.9681, p-val = 0.0002
>> >
>> >
>> >
>> > Model Results:
>> >
>> >
>> >
>> >           estimate      se     zval    pval    ci.lb    ci.ub
>> >
>> > intrcpt   -0.0740  0.0416  -1.7808  0.0749  -0.1554   0.0074    .
>> >
>> > dTf       -0.0668  0.0179  -3.7374  0.0002  -0.1019  -0.0318  ***
>> >
>> >       [[alternative HTML version deleted]]
>> >
>> > _______________________________________________
>> > R-sig-meta-analysis mailing list
>> > R-sig-meta-analysis using r-project.org
>> > https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
>> >
>> > ---
>> > This email has been checked for viruses by AVG.
>> > https://www.avg.com
>> >
>> >
>>
>> --
>> Michael
>> http://www.dewey.myzen.co.uk/home.html
>>
> _______________________________________________
> R-sig-meta-analysis mailing list
> R-sig-meta-analysis using r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
>

	[[alternative HTML version deleted]]