[R-meta] Do we assume multi-stage sampling of effect sizes in multi-level models?

Farzad Keyhan f.keyhaniha using gmail.com
Thu Jul 22 04:19:02 CEST 2021

Dear Wolfgang,

Thank you for sharing your thoughts here. Yes, aside from the CLT, "additive
phenomena" (the 'sum' in your comment) in nature tend to cluster heavily
around mid-level realizations and form normal distributions when observed
in large enough quantities (see Breiman, 1968)
[but I would rather know what some of those factors that lead to how a
study is run are!].
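As a rough illustration of the additive argument, the sketch below simulates
true effects as sums of many small, independent two-way design decisions and
checks that the result looks approximately normal. Everything in it is an
assumption made for illustration (the number of studies, the number of
factors, the size of each contribution, and the function names); it is not
taken from any of the references mentioned in this thread.

```python
import random
import statistics

# Assumption for illustration only: a study's true effect is the sum of many
# small, independent design decisions, each nudging the effect up or down.
N_STUDIES = 2000   # hypothetical number of studies
N_FACTORS = 100    # hypothetical number of small factors per study
STEP = 0.05        # hypothetical size of each factor's contribution


def simulate_true_effects(n_studies=N_STUDIES, n_factors=N_FACTORS,
                          step=STEP, seed=2021):
    """Return one simulated 'true effect' per study (a sum of +/- step terms)."""
    rng = random.Random(seed)
    return [
        sum(rng.choice((-step, step)) for _ in range(n_factors))
        for _ in range(n_studies)
    ]


def standardized_moment(values, order):
    """Average of ((x - mean) / sd) ** order, using the population sd."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((x - mean) / sd) ** order for x in values])


effects = simulate_true_effects()

# Both quantities are 0 for an exactly normal distribution.
skewness = standardized_moment(effects, 3)
excess_kurtosis = standardized_moment(effects, 4) - 3

print(f"mean            = {statistics.fmean(effects):.3f}")
print(f"sd              = {statistics.pstdev(effects):.3f}")
print(f"skewness        = {skewness:.3f}")
print(f"excess kurtosis = {excess_kurtosis:.3f}")
```

Under these assumptions the theoretical standard deviation is
STEP * sqrt(N_FACTORS) = 0.5, and skewness and excess kurtosis should both be
close to zero. The sketch says nothing about what the factors actually are,
which is the open question raised above.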

Exchangeability is a shortcut assumption that allows the probabilistic
structures of the observed data Y = {y1, ..., yn} to be 'similar' (not
identical, as in iid data) by assuming that the data distribution's focal
parameter is a random variable (forming an upper-level distribution). That
is, instead of assuming that the set Y is iid, we assume the observations in
Y are "conditionally [on a random parameter] independent" to avoid
drawing inference from an *n*-variate joint distribution.

And yes, exchangeability makes inference insensitive to the order in which a
set of data points is observed, e.g.,
Pr(Y1 = 1, Y2 = 0, Y3 = 1) = Pr(Y1 = 0, Y2 = 1, Y3 = 1)
                           = Pr(Y1 = 1, Y2 = 1, Y3 = 0).
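The order-invariance above can be checked exactly in a small sketch. It
assumes, purely for illustration, that the Y's are Bernoulli(theta) and
conditionally independent given theta, with theta drawn from a Beta(1, 1)
upper-level distribution; the function name joint_prob is hypothetical.

```python
from fractions import Fraction


def joint_prob(ys, a=1, b=1):
    """Exact Pr(Y1 = y1, ..., Yn = yn) for a Beta(a, b)-Bernoulli mixture.

    Assumes Y_i | theta ~ Bernoulli(theta) independently and
    theta ~ Beta(a, b) with positive integer a and b. The joint probability
    is built up from the one-step-ahead predictive probabilities.
    """
    prob = Fraction(1)
    ones = 0
    for m, y in enumerate(ys):
        # Predictive probability of a 1 after seeing `ones` ones in m draws.
        p_one = Fraction(a + ones, a + b + m)
        prob *= p_one if y == 1 else 1 - p_one
        ones += y
    return prob


# The three orderings from the text have the same joint probability.
p_101 = joint_prob((1, 0, 1))
p_011 = joint_prob((0, 1, 1))
p_110 = joint_prob((1, 1, 0))
print(p_101, p_011, p_110)

# Exchangeable is weaker than iid: Y1 and Y2 are not independent here.
p_y1 = joint_prob((1,))
p_y1_y2 = joint_prob((1, 1))
print(p_y1_y2, "vs", p_y1 * p_y1)
```

With the Beta(1, 1) assumption each of the three orderings has probability
1/12, while Pr(Y1 = 1, Y2 = 1) = 1/3 differs from Pr(Y1 = 1) * Pr(Y2 = 1) =
1/4, so the draws are exchangeable without being independent.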

At the risk of digressing a bit, it is generally interesting that Bayesian
(non-multilevel) inference and frequentist multilevel inference, esp. via
REML, both capitalize on exchangeability.

Once again, thanks for reminding me of exchangeability,

I also agree with your very last point!

All the best,

On Wed, Jul 21, 2021 at 2:50 PM Viechtbauer, Wolfgang (SP) <
wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:

> Let me chime in here. I am going to focus on the 'multilevel sampling'
> that underlies the standard random-effects model (which can actually be
> regarded as a two-level model) - not some further complications that arise
> if there are multiple outcomes / effects for the same study.
>
> The 'sampling' that happens at the level of the observed effects is the
> same as what we assume in essentially all of statistics. All statistics
> have a sampling distribution, whether they are a simple mean, the test
> statistic of a t-test, or a standardized mean difference. The sampling
> distributions of these statistics can be assumed to arise through random
> sampling, but hardly ever has such random sampling taken place in practice
> (leaving aside people doing surveys where there is a lot of emphasis on
> doing proper sampling). We just assume that the people who participated in
> our study (or whatever the unit of analysis is) are like a random sample
> from some population of people and that if we would run the same study all
> over again under identical circumstances, the same random processes would
> lead to a new sample that again is like a random sample from that same
> population. That doesn't imply that it's a random sample from a population
> we care about, but it can be assumed to be a random sample from *some*
> population. Again, all of this applies generally, not just to meta-analysis
> (where we just happen to have sampling distributions for the various effect
> size or outcome measures).
>
> Sidenote: There is another way one can motivate the concept of a sampling
> distribution which does not involve random sampling, but relies on the idea
> of randomization. For some discussion of this, see here:
> https://stats.stackexchange.com/questions/13607/can-non-random-samples-be-analyzed-using-standard-statistical-tests/13616#13616
>
> What gets a bit more philosophical is the concept of a 'population of
> studies' and that the studies we have are a random sample from that
> (usually purely hypothetical) population. I used to joke that believing in
> that hypothetical population is like believing in UFOs -- which
> coincidentally also look a bit like a normal distribution:
> https://www.closeup.de/media/oart_0/oart_i/oart_60621/thumbs/816944_2109747.jpg
> (yes, I used to be a big X-Files fan ...). But one could motivate this
> idea on the following grounds: Imagine that the way a study is run depends
> on many factors, each one of them one could just as well decide one way or
> another. Now imagine there is a large number of such factors - whose 'sum'
> essentially leads to the specific way a study is run, which in turn
> determines what the true outcome/effect is for such a study. Then based on
> the central limit theorem (not in terms of the number of studies, but in
> terms of the number of these small factors that get added up), the true
> effects would actually have a normal distribution and the true effects of
> the studies we have are a 'sample' from that distribution. So the sampling
> here is not so much that we, as meta-analysts, are really sampling studies,
> but that the studies themselves have 'sampled' the values of these various
> factors.
>
> This thinking may in fact be related to another way that some have
> motivated the idea of a random-effects model, which involves the concept of
> exchangeability:
> https://en.wikipedia.org/wiki/Exchangeable_random_variables
> so we don't have to assume random sampling, 'just' that the true effects
> are exchangeable. I can't find the reference(s) right now where this is
> discussed and I can't really say whether this really helps.
>
> Note that similar discussions arise in other contexts, for example whether
> it makes sense to use inferential statistics when one has actually sampled
> an entire population. Does it make sense to run a t-test then? Use of
> inferential statistics in such a context is often motivated on grounds that
> the specific population we have is just one of many that could have arisen.
> Now it seems like we are back to UFOs again ... I'll stop here. Also,
> discussions about this stuff get a lot more fun when you've had a couple
> beers.
>
> Best,
> Wolfgang
> >-----Original Message-----
> >From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-project.org]
> >On Behalf Of Farzad Keyhan
> >Sent: Wednesday, 21 July, 2021 20:41
> >To: James Pustejovsky
> >Cc: R meta
> >Subject: Re: [R-meta] Do we assume multi-stage sampling of effect sizes in
> >multi-level models?
> >
> >I have (and had) no issue with the first stage of sampling (simple random
> >sampling of studies). I also have no problem assuming (only
> >epistemologically) that there could be a "literature [that] has measured a
> >very large number of outcomes" (whatever 'outcome' means which may
> >introduce another stage of [random] sampling).
> >
> >My problem is with the second stage of the sampling. Particularly, to think
> >that knowledge at this stage can be formed on a simple random
> >sampling basis of some universe and not a non-probability (and perhaps
> >thoughtfully biased like purposive sampling) one is a bit counter-intuitive.
> >
> >In the end, compared to primary data, simple random sampling of effect
> >sizes (at any stage of sampling) seems a bit restricted.
> >
> >Once again many thanks for sharing your expertise (a question about your
> >expanding range paper to follow)
> >Fred
> >
> >On Wed, Jul 21, 2021 at 10:53 AM James Pustejovsky <jepusto using gmail.com>
> >wrote:
> >
> >> I'm not sure I agree about the theoretical impossibility of MLMA.
> >>
> >> Consider that the regular old random effects model also posits that we are
> >> sampling studies from some population. Usually that population is
> >> hypothetical (the set of possible studies that could conceivably be
> >> conducted on the topic). But sometimes we may identify a very large body of
> >> literature and then literally draw a random sample of records for purposes
> >> of coding (because coding is expensive and we have limited resources).
> >>
> >> One could imagine doing the same in a multi-stage setting, where every
> >> study in the literature has measured a very large number of outcomes. We
> >> first sample studies, then (again due to resource constraints), sample only
> >> a few of the outcomes from each study for purposes of effect size
> >> calculation. This is less plausible as a physical process, admittedly. But
> >> we could imagine that the primary investigators are engaging in something
> >> akin to this when they design their primary studies. Ideally, they would
> >> measure many different outcomes using many different
> >> instruments/scales/whatnot. But due to resource constraints, they can
> >> actually only collect a few measurements. Perhaps they choose
> >> instruments/scales more or less at random?
> >>
> >> On Wed, Jul 21, 2021 at 10:43 AM Farzad Keyhan <f.keyhaniha using gmail.com>
> >> wrote:
> >>
> >>> Thanks, James. You are right about what I meant by epistemological vs.
> >>> ontological. But the problem is that in the case of "primary data", it is
> >>> "theoretically possible" to follow a multi-stage plan but in many cases we
> >>> may not "afford" to do it, and so it doesn't happen (always
> >>> epistemologically plausible, but at times ontologically implausible).
> >>>
> >>> But in the case of multilevel meta-regression, it's "not even
> >>> theoretically possible" to assume so. Of course, I fully understand that
> >>> there is no remedy and it is what it is. But I just wanted to make sure I'm
> >>> not way off on this as a non-stats person.
> >>>
> >>> Thank you again, for your expertise and dedication,
> >>> Fred
> >>>
> >>> ps. My colleagues and I have run into a question reading your
> >>> expanding range paper (and applying it to our ongoing meta project) but
> >>> I'll, if you don't mind, ask that on this forum later.
> >>>
> >>> On Wed, Jul 21, 2021 at 10:07 AM James Pustejovsky <jepusto using gmail.com>
> >>> wrote:
> >>>
> >>>> Hi Fred,
> >>>>
> >>>> This is an interesting question, for sure, and I would love to hear how
> >>>> others think about it.
> >>>>
> >>>> My own perspective: I agree with your interpretation in that the
> >>>> assumptions of the multi-level meta-analysis (MLMA) model posit a two-stage
> >>>> sampling process, where we first sample studies from some population of
> >>>> possible studies and then sample effect sizes from the population of effect
> >>>> sizes that *could have been measured* within each of those studies. The
> >>>> overall average effect size parameter in the MLMA is then the average of
> >>>> study-specific average effect size parameters, which in turn are averages
> >>>> over a (hypothetical) set of effects that could have been assessed.
> >>>>
> >>>> An implication of this assumption is that the MLMA model attributes
> >>>> additional uncertainty to studies that measure only a single outcome. This
> >>>> happens because it treats those studies as having measured just one of many
> >>>> possible outcomes, rather than (for instance) as having measured the single
> >>>> gold-standard outcome given the constructs/question under investigation. I
> >>>> do worry about whether this assumption is reasonable, but at the moment I
> >>>> don't have any great ideas about how to probe it.
> >>>>
> >>>> Of course, just as with multi-level modeling of primary data, the
> >>>> assumptions of the model don't---and needn't---necessarily match up with
> >>>> the actual physical process used to collect the data. (I think this is what
> >>>> you were getting at in differentiating between the epistemology and the
> >>>> ontology?) Multi-level models are very commonly used with data collected
> >>>> through means other than multi-stage random sampling, and I've never heard
> >>>> of a meta-analytic dataset being assembled through a multi-stage sampling
> >>>> of effect size information. Whether using MLMA is a reasonable statistical
> >>>> strategy depends on a) whether the model's assumptions are a reasonable,
> >>>> stylized approximation of the process you're investigating and b) the
> >>>> robustness of the approach to violations of its assumptions.
> >>>>
> >>>> James
> >>>>
> >>>> On Tue, Jul 20, 2021 at 11:23 AM Farzad Keyhan <f.keyhaniha using gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Hello All,
> >>>>>
> >>>>> Applying multi-level models to "raw data" assumes that the data have been
> >>>>> collected via a multi-stage sampling plan (e.g., first randomly selecting
> >>>>> schools, then randomly selecting students from within those selected
> >>>>> schools) which makes the student data from within each school not be iid
> >>>>> (hierarchical dependence).
> >>>>>
> >>>>> But in meta-analysis, do we need to assume that a multi-stage sampling of
> >>>>> "effect sizes" (first randomly selecting some studies, then selecting some
> >>>>> effect sizes from within those studies) has occurred to justify the use of
> >>>>> multilevel meta-regression models?
> >>>>>
> >>>>> I would say, epistemologically yes (but ontologically no), but I wonder
> >>>>> what meta-analysis experts think?
> >>>>>
> >>>>> Thank you,
> >>>>> Fred
