[R-meta] mean-variance relationships introduces additional heterogeneity, how?

Tue Nov 2 21:30:35 CET 2021

Hi James,

That clears it up, thank you so much! This may be a can of worms (so I
only expect your general reflections), but can LRR also be preferred
over SMD in a couple of other count-based scenarios, where the
data-generating process for each subject's overall test response is
one of:

(A) Multinomial distribution
(B) Ordered categorical distribution

For (A), imagine students in each paper were given a test to circle
errors out of N underlined words in each of T # of sentences (thus,
for each sentence, a subject's response may follow a categorical
distribution, right?
https://en.wikipedia.org/wiki/Categorical_distribution).

For (B), imagine students in each paper were given a test with T # of
sentences and asked how accurate each sentence looked: completely
inaccurate (-2), somewhat inaccurate (-1), unable to judge (0),
somewhat accurate (1), completely accurate (2).

In both cases, each group's performance is summarized by its mean and
sd in the papers (again, for simplicity, let's imagine a
one-effect-size-per-study case).
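
To make the comparison concrete, here is a minimal sketch of how the
two effect sizes could be computed from one study's reported means and
sds. The function names and the summary statistics are hypothetical
(not taken from any paper), and the SMD shown is the plain pooled-sd
version without a small-sample correction.

```python
import math


def log_response_ratio(mean_a, mean_b):
    """LRR = ln(mean_b / mean_a); only defined for positive means."""
    if mean_a <= 0 or mean_b <= 0:
        raise ValueError("LRR requires strictly positive group means")
    return math.log(mean_b / mean_a)


def standardized_mean_difference(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """SMD = (mean_b - mean_a) / pooled sd (no small-sample correction)."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (
        n_a + n_b - 2
    )
    return (mean_b - mean_a) / math.sqrt(pooled_var)


# Hypothetical summary statistics for groups A and B in one study.
lrr = log_response_ratio(mean_a=10.0, mean_b=12.0)
smd = standardized_mean_difference(10.0, 4.0, 30, 12.0, 4.0, 30)
print(round(lrr, 4), round(smd, 4))
```

Note that LRR is not defined when a group mean is zero or negative,
which can happen under the -2 to 2 coding in (B).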

Thanks,
Luke
(PS. It may actually be the case that these two new scenarios happen
in the same meta-analysis that includes other papers where subjects'
overall test responses are binomially distributed. So, an effect size
parameter invariant/less sensitive to these data generation processes
is desperately needed.)
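
For a mixed set of papers like this, the best-fit-line diagnostic that
James describes in the quoted message below could be sketched roughly
as follows. The group means are made up for illustration, and
fit_line is a hypothetical helper that does ordinary least squares by
hand; an intercept near 0 would point toward a ratio metric, and a
slope near 1 toward a difference metric.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = beta0 + beta1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta1 = sxy / sxx
    beta0 = mean_y - beta1 * mean_x
    return beta0, beta1


# Hypothetical group-A means from four studies.
mu_a = [2.0, 4.0, 6.0, 8.0]

# Case 1: group-B means proportional to group-A means.
mu_b_prop = [1.5 * m for m in mu_a]
# Case 2: group-B means shifted from group-A means by a constant.
mu_b_shift = [m + 2.0 for m in mu_a]

b0_prop, b1_prop = fit_line(mu_a, mu_b_prop)
b0_shift, b1_shift = fit_line(mu_a, mu_b_shift)

print("proportional case: beta0 =", b0_prop, "beta1 =", b1_prop)
print("shifted case:      beta0 =", b0_shift, "beta1 =", b1_shift)
```

With real study means the fit will be noisy, so the intercept and
slope would be judged approximately rather than exactly.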

On Tue, Nov 2, 2021 at 10:31 AM James Pustejovsky <jepusto using gmail.com> wrote:
>
> Hi Luke,
>
> Sure. I mean that the best-fit line is something like
>
> mu-B = beta0 + beta1 mu-A
>
> But if beta0 = 0, then mu-B = beta1 mu-A, or
>
> beta1 = mu-B / mu-A,
>
> so the two means are proportionally related, which is what the
> response ratio metric describes.
>
> On the other hand, if we had a non-zero beta0 but had beta1 = 1, then
> mu-B = beta0 + mu-A, or
>
> beta0 = mu-B - mu-A,
>
> so the two means differ by a constant, which is what the risk
> difference metric (or difference-in-proportions) describes.
>
> James
>
> On Tue, Nov 2, 2021 at 10:19 AM Luke Martinez <martinezlukerm using gmail.com> wrote:
> >
> > Hi James,
> >
> > Thanks a lot for investing so much effort into my question! Let me ask
> > a quick question regarding the second diagnostic in your post.
> >
> > In your post, you note that *"[Since] there is a strong linear
> > relationship between the two [groups'] means, with a best-fit line
> > that might go through the origin. . . the response ratio might be an
> > appropriate metric."*
> >
> > Could you please elaborate on how this speaks to the appropriateness
> > of LRR over SMD?
> >
> > Luke
> >
> > On Tue, Nov 2, 2021 at 8:12 AM James Pustejovsky <jepusto using gmail.com> wrote:
> > >
> > > Hi Luke and listserv,
> > >
> > > I wrote up some thoughts on the question of using standardized mean
> > > differences to analyze outcomes measured as proportions:
> > > https://www.jepusto.com/mean-variance-relationships-and-smds/
> > > Thoughts, comments, questions, and critiques welcome.
> > >
> > > James
> > >
> > > On Mon, Oct 25, 2021 at 9:07 PM James Pustejovsky <jepusto using gmail.com> wrote:
> > > >
> > > > All I mean is that a skewed distribution or one with large outliers
> > > > does not necessarily *imply* that a mean-sd relationship exists. It
> > > > could be the result of one, but skewness might be due to something
> > > > else (such as selective reporting) instead.
> > > >
> > > > I would suggest that a well-behaved effect distribution is desirable
> > > > and appropriate to the extent that it indicates empirical regularity
> > > > of the phenomenon you're interested in. A less heterogeneous
> > > > distribution means that effects are more predictable (at least in the
> > > > corpus of studies that you're examining).
> > > >
> > > > On Mon, Oct 25, 2021 at 8:58 PM Luke Martinez <martinezlukerm using gmail.com> wrote:
> > > > >
> > > > > I thought the existence of outlying effect estimates under SMD and
> > > > > lack of it under LRR could attest to the existence of
> > > > > heterogeneity-generating artefacts like mean-sd relationships (and/or
> > > > > variation in measurement error) across the studies.
> > > > >
> > > > > If not, then, would you mind commenting on why a more symmetric and
> > > > > well-behaved effect distribution is equated with its appropriateness
> > > > > for a set of summaries (e.g., means & sds) from studies?
> > > > >
> > > > > Luke
> > > > >
> > > > > On Mon, Oct 25, 2021 at 8:47 PM James Pustejovsky <jepusto using gmail.com> wrote:
> > > > > >
> > > > > > Responses below.
> > > > > >
> > > > > > On Mon, Oct 25, 2021 at 4:21 PM Luke Martinez <martinezlukerm using gmail.com> wrote:
> > > > > > >
> > > > > > > Sure, thanks. Along the same lines, if I see that the unconditional
> > > > > > > distribution of the SMD estimates is multi-modal or right or left
> > > > > > > skewed (perhaps due to extreme outliers), but the unconditional
> > > > > > > distribution of the corresponding LRR estimates looks more symmetric
> > > > > > > and well-behaved, does that also empirically suggest a mean-sd
> > > > > > > relationship in one or more groups?
> > > > > >
> > > > > > I'm not sure that it implies a mean-sd relationship. But I think it
> > > > > > does suggest that LRR might be a more appropriate metric.
> > > > > >
> > > > > > > PS. Is there a reason for exploring the mean-sd relationship
> > > > > > > specifically in the control group?
> > > > > >
> > > > > > No, you could certainly examine the relationships in the treatment
> > > > > > group(s) as well.