[R-meta] Do the results of rma.mv() depend on how dataframe rows are ordered/arranged?
Gabriele Midolo
g@br|e|e@m|do|o @end|ng |rom gm@||@com
Thu Aug 22 16:55:32 CEST 2019
Dear James,
So, concerning the code I have put above, one simply need to sort the
ctrl_id with ascending order before computing V?
Something like:
data1<-data1%>%arrange(ctrl_id) # ...using dplyr
That way the dataframe should be ordered in the same way as
split(data1$vi_Bracken,data1$ctrl_id)...?
Thanks for this hint btw - very appreciated.
Gabri.
On Thu, 22 Aug 2019 at 15:58, James Pustejovsky <jepusto using gmail.com> wrote:
> Gabriele,
>
> The problem is that the variance matrix gets re-ordered when you do
> split() %>% lapply() %>% bldiag(). Try running the following to see the
> issue:
>
> # data frame, not sorted by ID
> dat <- data.frame(ID = rep(LETTERS[1:3], 4), v = 1:12)
> split(dat$v, dat$ID)
> bldiag(lapply(split(dat$v, dat$ID), diag, nrow = 4))
>
> To use the split() %>% lapply() %>% bldiag() technique and have the matrix
> come out so that the entries of the V matrix correspond to the rows of the
> data, the data needs to be sorted by the same ID variable that is used in
> split().
>
> James
>
> On Thu, Aug 22, 2019 at 7:30 AM Gabriele Midolo <gabriele.midolo using gmail.com>
> wrote:
>
>> Dear Michael,
>>
>> That shouldn't be the issue, because I re-compute V after I sorted the
>> rows and before I fitted the model (not showed in my previous email).
>> Please find attached a minimal data-set and reproducible R script
>> illustrating what is going on...
>>
>> Thank you very much,
>> Gabri.
>>
>> PS: yi and vi_Bracken are the effect size and sampling variance of
>> log-transformed response ratios obtained via escalc(), respectively.
>>
>> On Thu, 22 Aug 2019 at 13:07, Michael Dewey <lists using dewey.myzen.co.uk>
>> wrote:
>>
>>> Dear Gabriele
>>>
>>> In the code you supply it appears you re-ordered the data but not the V
>>> matrix. Is that the issue? If not can you post a reproducible example
>>> with a minimal data-set and, ideally, using just plain R to manipulate
>>> the data-set so we can see more clearly what the code does.
>>>
>>> Michael
>>>
>>> On 22/08/2019 10:29, Gabriele Midolo wrote:
>>> > Dear all,
>>> >
>>> > I noticed that the results of my rma.mv() model strongly changes when
>>> the
>>> > dataframe rows are arranged in a different order. This is new to me
>>> and I
>>> > can’t really understand why. I thought it was an issue related to how
>>> the V
>>> > matrix get computed (?), but they are actually identical independently
>>> from
>>> > rows order in the dataframe… how is a dataframe supposed to be arranged
>>> > before you can trust rma.mv() results then? Thanks.
>>> >
>>> >
>>> > EXAMPLE:
>>> >
>>> > calc.v <- function(x) {
>>> >
>>> > v <- matrix((x$sdC_imputed[1]^2 / (x$nC[1] * x$C[1]^2)) ,
>>> nrow=nrow(x),
>>> > ncol=nrow(x))
>>> >
>>> > diag(v) <- x$vi_Bracken
>>> >
>>> > v
>>> >
>>> > }
>>> >
>>> > random = ~ 1 | study/ID
>>> >
>>> >
>>> > If I run the following:
>>> >
>>> >
>>> >
>>> > a<-df%>%filter(TRAIT=="LA")
>>> >
>>> > dat<-a%>%mutate(sdC_imputed=sdC)%>%ungroup()
>>> >
>>> > dat<-metagear::impute_SD(dat, "sdC_imputed", "C", method =
>>> "Bracken1992")
>>> >
>>> > V <- metafor::bldiag(lapply(split(dat, dat$ctrl_id), calc.v))
>>> >
>>> > m<-rma.mv(yi~dTf,V,data=dat,random = random)
>>> >
>>> > m
>>> >
>>> >
>>> >
>>> > Multivariate Meta-Analysis Model (k = 34; method: REML)
>>> >
>>> >
>>> >
>>> > Variance Components:
>>> >
>>> >
>>> >
>>> > estim sqrt nlvls fixed factor
>>> >
>>> > sigma^2.1 0.0000 0.0000 7 no study
>>> >
>>> > sigma^2.2 0.0607 0.2464 34 no study/ID
>>> >
>>> >
>>> >
>>> > Test for Residual Heterogeneity:
>>> >
>>> > QE(df = 32) = 1024.8720, p-val < .0001
>>> >
>>> >
>>> >
>>> > Test of Moderators (coefficient 2):
>>> >
>>> > QM(df = 1) = 1.6018, p-val = 0.2056
>>> >
>>> >
>>> >
>>> > Model Results:
>>> >
>>> >
>>> >
>>> > estimate se zval pval ci.lb ci.ub
>>> >
>>> > intrcpt -0.0786 0.0538 -1.4617 0.1438 -0.1841 0.0268
>>> >
>>> > dTf -0.0233 0.0184 -1.2656 0.2056 -0.0593 0.0128
>>> >
>>> >
>>> >
>>> > If I run the same code but arrange the dataframe by e.g. the ID of the
>>> > observation via dplyr::arrange(),
>>> >
>>> > a<-df%>%filter(TRAIT=="LA")%>%arrange(ID)
>>> >
>>> >
>>> > Then I get the following output… i.e. a completely different result:
>>> >
>>> > Multivariate Meta-Analysis Model (k = 34; method: REML)
>>> >
>>> >
>>> >
>>> > Variance Components:
>>> >
>>> >
>>> >
>>> > estim sqrt nlvls fixed factor
>>> >
>>> > sigma^2.1 0.0000 0.0000 7 no study
>>> >
>>> > sigma^2.2 0.0336 0.1834 34 no study/ID
>>> >
>>> >
>>> >
>>> > Test for Residual Heterogeneity:
>>> >
>>> > QE(df = 32) = 334.3701, p-val < .0001
>>> >
>>> >
>>> >
>>> > Test of Moderators (coefficient 2):
>>> >
>>> > QM(df = 1) = 13.9681, p-val = 0.0002
>>> >
>>> >
>>> >
>>> > Model Results:
>>> >
>>> >
>>> >
>>> > estimate se zval pval ci.lb ci.ub
>>> >
>>> > intrcpt -0.0740 0.0416 -1.7808 0.0749 -0.1554 0.0074 .
>>> >
>>> > dTf -0.0668 0.0179 -3.7374 0.0002 -0.1019 -0.0318 ***
>>> >
>>> > [[alternative HTML version deleted]]
>>> >
>>> > _______________________________________________
>>> > R-sig-meta-analysis mailing list
>>> > R-sig-meta-analysis using r-project.org
>>> > https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
>>> >
>>> > ---
>>> > This email has been checked for viruses by AVG.
>>> > https://www.avg.com
>>> >
>>> >
>>>
>>> --
>>> Michael
>>> http://www.dewey.myzen.co.uk/home.html
>>>
>> _______________________________________________
>> R-sig-meta-analysis mailing list
>> R-sig-meta-analysis using r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
>>
>
[[alternative HTML version deleted]]
More information about the R-sig-meta-analysis
mailing list