[R-meta] Do we assume multi-stage sampling of effect sizes in multi-level models?

Farzad Keyhan f.keyhaniha at gmail.com
Wed Jul 21 20:40:42 CEST 2021


I have (and had) no issue with the first stage of sampling (simple random
sampling of studies). I also have no problem assuming (only
epistemologically) that there could be a "literature [that] has measured a
very large number of outcomes" (whatever 'outcome' means, which may
introduce another stage of [random] sampling).

My problem is with the second stage of the sampling. In particular, it is a
bit counter-intuitive to think that selection at this stage could proceed by
simple random sampling from some universe, rather than by a non-probability
scheme (perhaps a thoughtfully biased one, like purposive sampling).

In the end, compared to primary data, the assumption of simple random
sampling of effect sizes (at any stage of sampling) seems a bit restrictive.

Once again many thanks for sharing your expertise (a question about your
expanding range paper to follow)
Fred


On Wed, Jul 21, 2021 at 10:53 AM James Pustejovsky <jepusto@gmail.com>
wrote:

> I'm not sure I agree about the theoretical impossibility of MLMA.
>
> Consider that the regular old random effects model also posits that we are
> sampling studies from some population. Usually that population is
> hypothetical (the set of possible studies that could conceivably be
> conducted on the topic). But sometimes we may identify a very large body of
> literature and then literally draw a random sample of records for purposes
> of coding (because coding is expensive and we have limited resources).
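>
> As a minimal sketch of that first-stage sampling in R (assuming 'records'
> is a data frame of located records; the name and the coding budget of 50
> are made up for illustration):
>
>   set.seed(42)                                   # reproducible draw
>   coded <- records[sample(nrow(records), 50), ]  # SRS of records to code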
>
> One could imagine doing the same in a multi-stage setting, where every
> study in the literature has measured a very large number of outcomes. We
> first sample studies, then (again due to resource constraints), sample only
> a few of the outcomes from each study for purposes of effect size
> calculation. This is less plausible as a physical process, admittedly. But
> we could imagine that the primary investigators are engaging in something
> akin to this when they design their primary studies. Ideally, they would
> measure many different outcomes using many different
> instruments/scales/whatnot. But due to resource constraints, they can
> actually only collect a few measurements. Perhaps they choose
> instruments/scales more or less at random?
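>
> To make this hypothetical two-stage process concrete, here is a small
> simulation sketch in R (all names and parameter values are invented for
> illustration, not estimates from any real literature):
>
>   set.seed(2021)
>   n_studies  <- 20    # studies drawn from the (hypothetical) population
>   n_possible <- 50    # outcomes each study *could* have measured
>   n_measured <- 3     # outcomes actually measured, per study
>   tau        <- 0.20  # SD of study-specific average effects
>   omega      <- 0.10  # SD of effects within a study
>
>   dat <- do.call(rbind, lapply(seq_len(n_studies), function(j) {
>     mu_j    <- 0.3 + rnorm(1, sd = tau)              # stage 1: sample a study
>     theta_j <- mu_j + rnorm(n_possible, sd = omega)  # all measurable effects
>     picked  <- sample(n_possible, n_measured)        # stage 2: SRS of outcomes
>     data.frame(study = j, esid = picked, theta = theta_j[picked])
>   }))
>   # (observed effect size estimates would add sampling error on top of theta)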
>
> On Wed, Jul 21, 2021 at 10:43 AM Farzad Keyhan <f.keyhaniha@gmail.com>
> wrote:
>
>> Thanks, James. You are right about what I meant by epistemological vs.
>> ontological. But the problem is that in the case of "primary data", it is
>> "theoretically possible" to follow a multi-stage plan, but in many cases we
>> cannot "afford" to do it, and so it doesn't happen (always
>> epistemologically plausible, but at times ontologically implausible).
>>
>> But in the case of multilevel meta-regression, it's "not even
>> theoretically possible" to assume so. Of course, I fully understand that
>> there is no remedy and it is what it is. But I just wanted to make sure I'm
>> not way off on this as a non-stats person.
>>
>> Thank you again, for your expertise and dedication,
>> Fred
>>
>> P.S. My colleagues and I have run into a question while reading your
>> expanding range paper (and applying it to our ongoing meta project), but,
>> if you don't mind, I'll ask that on this forum later.
>>
>> On Wed, Jul 21, 2021 at 10:07 AM James Pustejovsky <jepusto@gmail.com>
>> wrote:
>>
>>> Hi Fred,
>>>
>>> This is an interesting question, for sure, and I would love to hear how
>>> others think about it.
>>>
>>> My own perspective: I agree with your interpretation in that the
>>> assumptions of the multi-level meta-analysis (MLMA) model posit a two-stage
>>> sampling process, where we first sample studies from some population of
>>> possible studies and then sample effect sizes from the population of effect
>>> sizes that *could have been measured* within each of those studies. The
>>> overall average effect size parameter in the MLMA is then the average of
>>> study-specific average effect size parameters, which in turn are averages
>>> over a (hypothetical) set of effects that could have been assessed.
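>>>
>>> In metafor terms, this two-stage structure corresponds to something like
>>> the standard three-level fit below (a sketch; 'dat', with columns yi, vi,
>>> study, and esid, is an assumed dataset, not one from this thread):
>>>
>>>   library(metafor)
>>>   # effect sizes (esid) nested within studies: random intercepts at the
>>>   # study level and at the effect-size-within-study level
>>>   res <- rma.mv(yi, vi, random = ~ 1 | study/esid, data = dat)
>>>   res$sigma2  # [1] between-study and [2] within-study variance components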
>>>
>>> An implication of this assumption is that the MLMA model attributes
>>> additional uncertainty to studies that measure only a single outcome. This
>>> happens because it treats those studies as having measured just one of many
>>> possible outcomes, rather than (for instance) as having measured the single
>>> gold-standard outcome given the constructs/question under investigation. I
>>> do worry about whether this assumption is reasonable, but at the moment I
>>> don't have any great ideas about how to probe it.
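>>>
>>> To see that extra uncertainty numerically: under the model above, the
>>> model-implied variance of the lone effect from a single-outcome study is
>>> vi + sigma2[1] + sigma2[2], not just vi + sigma2[1] (a sketch with
>>> illustrative numbers, not values from any real analysis):
>>>
>>>   vi        <- 0.05  # sampling variance of the single observed effect
>>>   sig2_stud <- 0.04  # between-study variance component
>>>   sig2_es   <- 0.02  # within-study (between-outcome) variance component
>>>   vi + sig2_stud + sig2_es  # 0.11 under MLMA; 0.09 if the lone outcome
>>>                             # were treated as the study's gold standard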
>>>
>>> Of course, just as with multi-level modeling of primary data, the
>>> assumptions of the model don't (and needn't) necessarily match up with
>>> the actual physical process used to collect the data. (I think this is what
>>> you were getting at in differentiating between the epistemology and the
>>> ontology?) Multi-level models are very commonly used with data collected
>>> through means other than multi-stage random sampling, and I've never heard
>>> of a meta-analytic dataset being assembled through a multi-stage sampling
>>> of effect size information. Whether using MLMA is a reasonable statistical
>>> strategy depends on a) whether the model's assumptions are a reasonable,
>>> stylized approximation of the process you're investigating and b) the
>>> robustness of the approach to violations of its assumptions.
>>>
>>> James
>>>
>>>
>>> On Tue, Jul 20, 2021 at 11:23 AM Farzad Keyhan <f.keyhaniha@gmail.com>
>>> wrote:
>>>
>>>> Hello All,
>>>>
>>>> Applying multi-level models to "raw data" assumes that the data have been
>>>> collected via a multi-stage sampling plan (e.g., first randomly selecting
>>>> schools, then randomly selecting students from within those selected
>>>> schools), which makes the student data from within each school non-iid
>>>> (hierarchical dependence).
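>>>>
>>>> For the raw-data case, this is the usual random-intercept setup, e.g. in
>>>> lme4 (a sketch; 'students', 'score', and 'school' are hypothetical names):
>>>>
>>>>   library(lme4)
>>>>   # students nested within schools; school-level random intercept
>>>>   fit <- lmer(score ~ 1 + (1 | school), data = students)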
>>>>
>>>> But in meta-analysis, do we need to assume that a multi-stage sampling of
>>>> "effect sizes" (first randomly selecting some studies, then selecting some
>>>> effect sizes from within those studies) has occurred to justify the use of
>>>> multilevel meta-regression models?
>>>>
>>>> I would say epistemologically yes (but ontologically no), but I wonder
>>>> what meta-analysis experts think?
>>>>
>>>> Thank you,
>>>> Fred
>>>>
>>>
