[R-meta] Including subsections of test and overall test results in rma.mv

Mon Apr 17 11:20:53 CEST 2023

Please see my responses below.

Best,
Wolfgang

>-----Original Message-----
>From: Yuhang Hu [mailto:yh342 using nau.edu]
>Sent: Monday, 17 April, 2023 4:37
>To: R meta
>Cc: Viechtbauer, Wolfgang (NP)
>Subject: Re: [R-meta] Including subsections of test and overall test results in
>rma.mv
>
>Thank you, Wolfgang, for your valuable comments. I have a quick follow-up on the
>following part of your comments:
>
>"However, you are not including the writing-reading, writing-speaking, and
>reading-speaking correlations in your dataset and the relationship is non-
>linear."
>
>Yes, not all studies have all the subsections of the test in them.

That wasn't my point. My point was that there are also correlations between the writing, reading, and speaking measures in studies that used/reported these subsections, but the dataset structure you showed does not include them. For example, for study 1, you showed:

study trait_scale test_outcome
1             epq      overall
1             epq      writing
1             epq      reading
1             epq     speaking

but in principle, the full 4x4 correlation matrix would lead to:

study        var1         var2    
1             epq      overall
1             epq      writing
1             epq      reading
1             epq     speaking
1         writing      reading
1         writing     speaking
1         reading     speaking

(irrespective of whether in the end you are only interested in the size of the epq-<var2> correlations).

In this case, the epq-overall correlation could be reconstructed from the other six correlations and in that sense is redundant. However, since your dataset does not include the three additional correlations among the subsections, one cannot reconstruct the epq-overall correlation from it.

>But I wonder
>how could that indicate that the relationship between the subsections of the test
>and their respective overall in each study is non-linear (btw, by relationship,
>we really mean the sampling distribution of the subsections and that of their
>overall tests are correlated, either linearly or non-linearly, right?)?

Not sure what you mean by that. The epq-overall correlation is equal to a (non-linear) function of the correlations in the 4x4 correlation matrix (see my previous reply). This has nothing to do with sampling distributions. It is simply:

r_epq_overall = (r_epq_writing + r_epq_reading + r_epq_speaking) / 
                sqrt(3 + 2*r_writing_reading + 2*r_writing_speaking + 2*r_reading_speaking)
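
As a minimal sketch of this formula, the following computes the composite correlation both ways: term by term as written above, and via the matrix shortcut from the earlier message in this thread. The correlation values are the illustrative ones from that message (with "epq" taking the place of "other"); they are assumptions for demonstration, not data from any actual study.

```r
# Illustrative 4x4 correlation matrix (values reused from the earlier example
# in this thread; they are not real study data)
R <- structure(c(1, 0.4, 0.27, 0.27, 0.4, 1, 0.22, 0.54, 0.27,
                 0.22, 1, 0.56, 0.27, 0.54, 0.56, 1), dim = c(4L, 4L))
rownames(R) <- colnames(R) <- c("writing", "reading", "speaking", "epq")

# Term-by-term version of the formula given above
r_epq_overall <-
  (R["epq", "writing"] + R["epq", "reading"] + R["epq", "speaking"]) /
  sqrt(3 + 2 * R["writing", "reading"] + 2 * R["writing", "speaking"] +
         2 * R["reading", "speaking"])

# Equivalent matrix version: sum of the epq-subsection correlations divided by
# the standard deviation of the sum of the three standardized subsections
r_matrix <- sum(R["epq", 1:3]) / sqrt(sum(R[1:3, 1:3]) * R["epq", "epq"])

round(r_epq_overall, 3)               # approximately 0.627
all.equal(r_epq_overall, r_matrix)    # both versions agree
```

Note that the denominator depends on the three correlations among the subsections, which is why the overall correlation cannot be recovered from the epq-subsection correlations alone.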

>Related to that is the fact that since studies differ in their subsections of the
>test, their respective overalls, thus, don't mean the same thing across the
>studies. So, the category "overall" doesn't seem to be a useful addition to the
>outcome variable insofar as the fixed effect (and correlated random-effects) of
>variable outcome is of interest.

Can't quite follow this either. Before, you said that you were specifically interested in "exploring the relationship mentioned above <<both>> in terms of the overall achievement test outcome as well as the subsections of the test outcomes." But now it sounds like you changed your mind.

But in any case, I don't have any further thoughts on whether including both the overall and the subsection correlations in the same analysis really makes sense. Again though, while computing something like cov(r_epq_overall, r_epq_writing) for the V matrix is possible, it will involve using the multivariate delta method and will require knowing the full 4x4 correlation matrix.
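
A rough sketch of what such a delta-method computation could look like is given below. It assumes the standard large-sample covariance formula for dependent correlations under multivariate normality (divided here by n; some authors use n - 1), an illustrative correlation matrix reused from the earlier message, and a made-up sample size. The helper name acov_r() is hypothetical and not part of any package. This is one possible approach under those assumptions, not a vetted implementation.

```r
# Illustrative 4x4 correlation matrix and an assumed sample size
R <- structure(c(1, 0.4, 0.27, 0.27, 0.4, 1, 0.22, 0.54, 0.27,
                 0.22, 1, 0.56, 0.27, 0.54, 0.56, 1), dim = c(4L, 4L))
rownames(R) <- colnames(R) <- c("writing", "reading", "speaking", "epq")
n <- 100  # assumed, for illustration only

# Hypothetical helper: large-sample covariance between r_jk and r_hm
acov_r <- function(R, j, k, h, m, n) {
  (0.5 * R[j, k] * R[h, m] *
     (R[j, h]^2 + R[j, m]^2 + R[k, h]^2 + R[k, m]^2) +
     R[j, h] * R[k, m] + R[j, m] * R[k, h] -
     (R[j, k] * R[j, h] * R[j, m] + R[k, j] * R[k, h] * R[k, m] +
        R[h, j] * R[h, k] * R[h, m] + R[m, j] * R[m, k] * R[m, h])) / n
}

# The six unique correlations, as (row, column) index pairs into R
idx  <- rbind(c(4, 1), c(4, 2), c(4, 3), c(1, 2), c(1, 3), c(2, 3))
labs <- c("epq_writing", "epq_reading", "epq_speaking",
          "writing_reading", "writing_speaking", "reading_speaking")

# 6x6 covariance matrix of the six correlations
S <- matrix(NA_real_, 6, 6, dimnames = list(labs, labs))
for (a in 1:6) for (b in 1:6)
  S[a, b] <- acov_r(R, idx[a, 1], idx[a, 2], idx[b, 1], idx[b, 2], n)

# Gradient of r_epq_overall = A / sqrt(B) with respect to the six correlations
A <- sum(R["epq", 1:3])   # sum of the epq-subsection correlations
B <- sum(R[1:3, 1:3])     # 3 + 2 * (sum of the subsection intercorrelations)
grad <- c(rep(1 / sqrt(B), 3), rep(-A / B^1.5, 3))

# Jacobian mapping the six correlations to (overall, writing, reading, speaking)
J <- rbind(grad, diag(6)[1:3, ])
rownames(J) <- c("epq_overall", "epq_writing", "epq_reading", "epq_speaking")

# Delta-method covariance matrix on the raw correlation scale
V <- J %*% S %*% t(J)
V["epq_overall", "epq_writing"]

# Approximate conversion to the r-to-z scale via a second delta-method step
r  <- unname(c(A / sqrt(B), R["epq", 1:3]))
D  <- diag(1 / (1 - r^2))
Vz <- D %*% V %*% D
dimnames(Vz) <- dimnames(V)
```

With this large-sample approximation, the diagonal elements of Vz for the individual subsection correlations come out as 1/n rather than the more commonly used 1/(n - 3), so some adjustment may be appropriate before using such a matrix as V in rma.mv().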

>Thank you,
>Yuhang
>
>On Sun, Apr 16, 2023 at 4:37 AM Viechtbauer, Wolfgang (NP)
><wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
>Dear Yuhang,
>
>Interesting question. I cannot give you a direct answer, but just some thoughts:
>
>The 'overall' correlation is a so-called 'composite correlation' that can be
>reconstructed from the 4x4 correlation matrix for writing, reading, speaking, and
>whatever other variable these variables are being correlated with. For example,
>say that you have the following correlation matrix:
>
>R <- structure(c(1, 0.4, 0.27, 0.27, 0.4, 1, 0.22, 0.54, 0.27,
>0.22, 1, 0.56, 0.27, 0.54, 0.56, 1), dim = c(4L, 4L))
>rownames(R) <- colnames(R) <- c("writing", "reading", "speaking", "other")
>R
>
>Then the correlation between the sum (or mean) of the standardized writing,
>reading, and speaking variables with the "other variable" can be computed, for
>example, with the composite_r_matrix() function from the 'psychmeta' package:
>
>library(psychmeta)
>composite_r_matrix(R, 1:3, 4)
>
>Or one can do this manually with:
>
>sum(R[4,1:3]) / sqrt(sum(R[1:3,1:3]) * sum(R[4,4]))
>
>So, there is a direct 'functional' relationship between the individual
>correlations and the overall one and in that sense, one might argue that
>including the individual correlations and the overall one is redundant. However,
>you are not including the writing-reading, writing-speaking, and reading-speaking
>correlations in your dataset and the relationship is non-linear. So in that
>sense, one might argue that including both sets is permissible. However, when
>doing so, it is important that one gets the covariance between all these
>correlations correct in the V matrix. You say that your V matrix captures those
>covariances, but I would be curious how you computed those covariances. Given the
>relationship above, it is of course possible to compute those covariances, but
>this doesn't seem entirely trivial to me.
>
>Best,
>Wolfgang
>
>>-----Original Message-----
>>From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-project.org] On
>>Behalf Of Yuhang Hu via R-sig-meta-analysis
>>Sent: Saturday, 08 April, 2023 5:39
>>To: R meta
>>Cc: Yuhang Hu
>>Subject: [R-meta] Including subsections of test and overall test results in
>>rma.mv
>>
>>Hello Meta Experts,
>>
>>I'm exploring the relation between a personality trait and an
>>achievement test outcome across a set of studies.
>>
>>Some studies report the relation of the trait with both the overall
>>achievement test outcome (one correlation) as well as the subsections of
>>the test outcomes (multiple correlations).
>>
>>I'm interested in exploring the relationship mentioned above <<both>> in
>>terms of the overall achievement test outcome as well as the subsections of
>>the test outcomes.
>>
>>So, my current data looks like what I'm showing below.
>>
>>I do have a V matrix in my model that correlates the correlation coefs in
>>each study due to the same subjects taking the subsections of the test
>>outcomes <<as well as>> the overall test outcome.
>>
>>*My question is that: given my V matrix, is it fine if I include both the
>>subsections of the test as well as the overall test outcomes in the same
>>model?*
>>
>>(My hunch is that this is not permissible because in the current model the
>>overall test outcome is essentially treated as a new outcome while the
>>overall test outcome essentially subsumes the subsections of the test
>>outcomes, not a new outcome.)
>>
>>rma.mv(r2z~test_outcome, V, random=~1 | trait_scale/study/test_outcome/es)
>>
>>study trait_scale test_outcome  r2z   v_r2z  es
>>1         epq           overall
>>1         epq           writing
>>1         epq           reading
>>1         epq           speaking
>>2         16pf          overall
>>3         epi            writing
>>3         epi            speaking
>>
>>Thank you,
>>Yuhang