[R-meta] mean-variance relationships introduces additional heterogeneity, how?

Tue Nov 2 23:41:56 CET 2021

Yes, sources of methodologically driven heterogeneity can be even more
numerous in scenario B (looking forward to another post on this ;-). But
at any rate, I think all of this shows that defining the effect size
parameter in the SMD metric may not be a viable option in lots of
situations, for example where researchers make their own tests,
measures, or instruments, in young fields where established measures
are yet to be developed, or when behavioral data are difficult to collect.

For SMD fans in these situations, LRR seems to be a good alternative.

Thanks again,
Luke
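
As an illustration of the scenario (B) concern discussed in this thread, here is a minimal R sketch. It is not code from either correspondent. It extends the snippet quoted below to several hypothetical studies that share the same latent effect but have study-specific cutpoints. The function name simulate_study, the latent effect delta = 0.5, N = 200, and the choice of 20 studies are all assumptions made for illustration.

```r
# Sketch only: all parameter values are illustrative assumptions.
set.seed(2021)

simulate_study <- function(N = 200, delta = 0.5) {
  a <- rnorm(1, 0, 0.4)                     # difficulty shift (study-specific)
  b <- rgamma(1, shape = 10, scale = 0.1)   # discrimination stretch (study-specific)
  cutpoints <- b * c(-Inf, -1.5, 1, 1.5, Inf) + a

  # Turn a latent group mean into observed ordinal responses (coded 1 to 4)
  likert <- function(latent_mean) {
    latent <- rnorm(N, mean = latent_mean)
    continuous <- latent + rnorm(N, mean = 0, sd = 0.6)
    findInterval(continuous, vec = cutpoints)
  }

  ctl <- likert(0)      # control group
  trt <- likert(delta)  # treatment group, same latent effect in every study

  sd_pooled <- sqrt((var(ctl) + var(trt)) / 2)
  c(mean_ctl = mean(ctl),
    sd_ctl   = sd(ctl),
    smd      = (mean(trt) - mean(ctl)) / sd_pooled,
    lrr      = log(mean(trt) / mean(ctl)))
}

# One row of summary statistics per simulated study
results <- as.data.frame(t(replicate(20, simulate_study())))
round(head(results), 2)
```

Any spread in the smd and lrr columns here comes from the cutpoints and sampling error alone, since the latent effect is held fixed. Note that the log response ratio computed this way depends on how the categories are coded (1 to 4 here), so it should be read as a rough comparison rather than a recommendation.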

On Tue, Nov 2, 2021 at 5:03 PM James Pustejovsky <jepusto using gmail.com> wrote:
>
> In scenario (B), things would get more complicated if the items used
> in one study differ in terms of difficulty or discrimination from the
> items used in another study. In the model you sketched, imagine that
> different studies have different cutpoints, e.g. for study i, you have
>
> latent = rnorm(N, mean=0) # true latent construct on which the test is built
> sentence1_reponses = latent + rnorm(N, mean=0, sd=0.6) # continuous sentence 1 responses
>
> a <- rnorm(1, 0, 0.4) # difficulty shift
> b <- rgamma(1, shape = 10, scale = 0.1) # discrimination stretch
> cutpoints <- b * c(-Inf, -1.5, 1,1.5,Inf) + a
> (likert_sentence1_reponses = findInterval(sentence1_reponses, vec = cutpoints))
>
> On Tue, Nov 2, 2021 at 4:50 PM Luke Martinez <martinezlukerm using gmail.com> wrote:
> >
> > Sure, in the (B) scenario, though, I think, things may be a bit
> > different. Because the construct on which the test is built itself is
> > often taken to be continuous.
> >
> > latent = rnorm(N, mean=-2) # true latent construct on which the test is built
> >
> > sentence1_reponses = latent + rnorm(N, mean=0, sd=0.6) # continuous sentence 1 responses
> >
> > (likert_sentence1_reponses = findInterval(sentence1_reponses, vec=c(-Inf, -1.5, 1,1.5,Inf)))
> >
> > On Tue, Nov 2, 2021 at 4:34 PM James Pustejovsky <jepusto using gmail.com> wrote:
> > >
> > > I would think the same concerns I described for Poisson and binomial
> > > distributions *could* apply in these situations in a similar way.
> > > However, it depends entirely on the distributions governing the
> > > measurements and whether they exhibit mean-variance relationships. I
> > > would guess that the diagnostics I sketched out in my post might be
> > > helpful for investigating such concerns with these types of
> > > measurements as well (again, just speculation though).
> > >
> > > James
> > >
> > > On Tue, Nov 2, 2021 at 3:30 PM Luke Martinez <martinezlukerm using gmail.com> wrote:
> > > >
> > > > Hi James,
> > > >
> > > > That clears it up, thank you so much! This may be a can of worms (so I
> > > > only expect your general reflections), but can LRR also be preferred over
> > > > SMD in a couple of other count-based scenarios where the data-generating
> > > > process for each subject's overall test response can be
> > > > one of:
> > > >
> > > > (A) Multinomial distribution
> > > > (B) Ordered categorical distribution
> > > >
> > > > For (A), imagine students in each paper were given a test to circle
> > > > errors out of N underlined words in each of T # of sentences (thus,
> > > > for each sentence, a subject's response may follow a categorical
> > > > distribution, right?
> > > > https://en.wikipedia.org/wiki/Categorical_distribution).
> > > >
> > > > For (B), imagine students in each paper were given a test with T # of
> > > > sentences and asked how accurate each sentence looked: completely
> > > > inaccurate (-2), somewhat inaccurate (-1), unable to judge (0),
> > > > somewhat accurate (1), completely accurate (2).
> > > >
> > > > In both cases, each group's performance is summarized by its mean and
> > > > sd in the papers (again, for simplicity, let's imagine a
> > > > one-effect-size-per-study case).
> > > >
> > > > Thanks,
> > > > Luke
> > > > (ps. It actually may be the case that these two new scenarios happen
> > > > in the same meta-analysis that includes other papers where subjects'
> > > > overall test responses are binomially distributed. So, an effect size
> > > > parameter invariant/less sensitive to these data generation processes
> > > > is desperately needed.)
> > > >
> > > >
> > > >
> > > > On Tue, Nov 2, 2021 at 10:31 AM James Pustejovsky <jepusto using gmail.com> wrote:
> > > > >
> > > > > Hi Luke,
> > > > >
> > > > > Sure. I mean that the best-fit line is something like
> > > > >
> > > > > mu-B = beta0 + beta1 mu-A
> > > > >
> > > > > But if beta0 = 0, then mu-B = beta1 mu-A, or
> > > > >
> > > > > beta1 = mu-B / mu-A,
> > > > >
> > > > > so the two means are proportionally related, which is what the
> > > > > response ratio metric describes.
> > > > >
> > > > > > On the other hand, if we had a non-zero beta0 but had beta1 = 1, then
> > > > > mu-B = beta0 + mu-A, or
> > > > >
> > > > > beta0 = mu-B - mu-A,
> > > > >
> > > > > so the two means differ by a constant, which is what the risk
> > > > > difference metric (or difference-in-proportions) describes.
> > > > >
> > > > > James
> > > > >
> > > > > On Tue, Nov 2, 2021 at 10:19 AM Luke Martinez <martinezlukerm using gmail.com> wrote:
> > > > > >
> > > > > > Hi James,
> > > > > >
> > > > > > Thanks a lot for investing so much effort into my question! Let me ask
> > > > > > a quick question regarding the second diagnostic in your post.
> > > > > >
> > > > > > In your post, you note that *"[Since] there is a strong linear
> > > > > > relationship between the two [groups'] means, with a best-fit line
> > > > > > that might go through the origin. . . the response ratio might be an
> > > > > > appropriate metric."*
> > > > > >
> > > > > > Could you please elaborate on how this speaks to the appropriateness
> > > > > > of LRR over SMD?
> > > > > >
> > > > > > Luke
> > > > > >
> > > > > > On Tue, Nov 2, 2021 at 8:12 AM James Pustejovsky <jepusto using gmail.com> wrote:
> > > > > > >
> > > > > > > > Hi Luke and listserv,
> > > > > > >
> > > > > > > I wrote up some thoughts on the question of using standardized mean
> > > > > > > differences to analyze outcomes measured as proportions:
> > > > > > > https://www.jepusto.com/mean-variance-relationships-and-smds/
> > > > > > > Thoughts, comments, questions, and critiques welcome.
> > > > > > >
> > > > > > > James
> > > > > > >
> > > > > > > On Mon, Oct 25, 2021 at 9:07 PM James Pustejovsky <jepusto using gmail.com> wrote:
> > > > > > > >
> > > > > > > > All I mean is that a skewed distribution or one with large outliers
> > > > > > > > does not necessarily *imply* that a mean-sd relationship exists. It
> > > > > > > > could be the result of one, but skewness might be due to something
> > > > > > > > else (such as selective reporting) instead.
> > > > > > > >
> > > > > > > > I would suggest that a well-behaved effect distribution is desirable
> > > > > > > > and appropriate to the extent that it indicates empirical regularity
> > > > > > > > of the phenomenon you're interested in. A less heterogeneous
> > > > > > > > distribution means that effects are more predictable (at least in the
> > > > > > > > corpus of studies that you're examining).
> > > > > > > >
> > > > > > > > On Mon, Oct 25, 2021 at 8:58 PM Luke Martinez <martinezlukerm using gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > I thought the existence of outlying effect estimates under SMD and
> > > > > > > > > lack of it under LRR could attest to the existence of
> > > > > > > > > heterogeneity-generating artefacts like mean-sd relationships (and/or
> > > > > > > > > variation in measurement error) across the studies.
> > > > > > > > >
> > > > > > > > > If not, then, would you mind commenting on why a more symmetric and
> > > > > > > > > well-behaved effect distribution is equated with its appropriateness
> > > > > > > > > for a set of summaries (e.g., means & sds) from studies?
> > > > > > > > >
> > > > > > > > > Luke
> > > > > > > > >
> > > > > > > > > On Mon, Oct 25, 2021 at 8:47 PM James Pustejovsky <jepusto using gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > Responses below.
> > > > > > > > > >
> > > > > > > > > > On Mon, Oct 25, 2021 at 4:21 PM Luke Martinez <martinezlukerm using gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > Sure, thanks. Along the same lines, if I see that the unconditional
> > > > > > > > > > > distribution of the SMD estimates is multi-modal or right or left
> > > > > > > > > > > skewed (perhaps due to extreme outliers), but the unconditional
> > > > > > > > > > > distribution of the corresponding LRR estimates looks more symmetric
> > > > > > > > > > > and well-behaved, does that also empirically suggest a mean-sd
> > > > > > > > > > > relationship in one or more groups?
> > > > > > > > > >
> > > > > > > > > > I'm not sure that it implies a mean-sd relationship. But I think it
> > > > > > > > > > does suggest that LRR might be a more appropriate metric.
> > > > > > > > > >
> > > > > > > > > > > PS. Is there a reason for exploring the mean-sd relationship
> > > > > > > > > > > specifically in the control group?
> > > > > > > > > >
> > > > > > > > > > No, you could certainly examine the relationships in the treatment
> > > > > > > > > > group(s) as well.