[R-meta] AICc or variance components, which one matters more?

Luke Martinez martinezlukerm using gmail.com
Sat Nov 13 18:43:55 CET 2021


Hi Wolfgang,

I'm fully with you; however, in my data there is only one instance in
which two labs (labs 1 and 2) collaborated on a study (study 2).
Specifically, part of study 2 was carried out by lab 1 (one row) and
part of it by lab 2 (one row). Apart from this one case, no such
between-lab collaborations occur anywhere in the data.

If this between-lab collaboration did not exist, I could go directly
for g1 (the hierarchical model). But with this collaboration, there is
at least a slight case for g2 (the crossed model) as well.

So, do you think the comparison between g1 and g2 should be based on
AICc, or on the dominant data structure (ignoring the one exception)?

(g1=rma.mv(yi, vi, random = ~1|lab/study, data = dd))
(g2=rma.mv(yi, vi, random = list(~1|lab, ~1|study), data = dd))
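
For reference, the AICc values I'm comparing come from fitstats() (a
minimal sketch, assuming both models have been fit to the full data set dd):

library(metafor)
fitstats(g1, g2)        # logLik, deviance, AIC, BIC, AICc for both models
fitstats(g1, g2)[5, ]   # just the AICc row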

Thanks,
Luke

On Sat, Nov 13, 2021 at 9:16 AM Viechtbauer, Wolfgang (SP)
<wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
>
> >>> I don't know why you think a model is better if the estimates of its variance
> >components are non-zero.
> >
> >Because of two things. 1) It may be more desirable/efficient for a random factor
> >to explain some heterogeneity (as in g1) rather than not (as in g2). 2) The CI
> >for the variance component (using the full data) is narrower for lab in g1
> >relative to that in g2.
>
> One can debate 1). For example, if I fit a random-effects model and tau^2 ends up being estimated to be 0, then I would be happy, because this suggests that the true effects are homogeneous, which *greatly* simplifies the conclusions. In fact, my estimate of the pooled effect is then *more* efficient than if there were heterogeneity.
>
> As for 2): To me, that would not be a reason for picking one model over the other, at least not in this case, where the two models make fundamentally different assumptions about the structure of the data.
>
> >>> Personally, I would rather think about the meaning of these models and use the
> >one that makes more sense in the given context.
> >
> >Sure, but that's not always straightforward. For instance, imagine that in your
> >current data the dominant structure is the hierarchy of A over B, with a few
> >exceptions where this hierarchy breaks down (meaning A and B could also be
> >considered crossed).
> >
> >In such a situation, one may think that if the data were much larger, the few
> >exceptions in the current data might actually turn out to be the dominant
> >structure (meaning that A and B being crossed would be a more representative
> >model of the population structure of A and B), or maybe that's just overthinking.
> >
> >So when meaning alone is not conclusive, you may rely somewhat on a comparison
> >across a set of candidate models. Here, AICc and how much heterogeneity is
> >explained by each model may at least help you pick the one that more fully
> >describes the current data.
> >
> >Curious to know your thoughts?
>
> If you want to compare models, then fit indices would be a useful approach (and if the models are nested, then LRTs could be used).
>
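
To make the nested case concrete (a minimal sketch, assuming metafor is
loaded): g1 and g2 themselves are not nested, so only fit indices apply to
that pair, but the lab component of g1 could be tested by refitting g1 with
that component fixed at 0 and comparing the two fits:

g1.red <- rma.mv(yi, vi, random = ~ 1 | lab/study,
                 sigma2 = c(0, NA), data = dd)   # lab variance fixed at 0
anova(g1, g1.red)                                # likelihood ratio test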
> >All the best,
> >Luke
> >
> >Sat, Nov 13, 2021, 6:56 AM Viechtbauer, Wolfgang (SP)
> ><wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
> >Please see my comments below.
> >
> >Best,
> >Wolfgang
> >
> >>-----Original Message-----
> >>From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-project.org] On
> >>Behalf Of Philippe Tadger
> >>Sent: Saturday, 13 November, 2021 11:03
> >>To: Luke Martinez; R meta
> >>Subject: Re: [R-meta] AICc or variance components, which one matters more?
> >>
> >>Dear Luke
> >>
> >>I would suggest choosing the model that provides a reliable estimate
> >>of the variances. Usually, a variance estimate equal to zero or a
> >>correlation estimate equal to 1 or -1 is a potential warning signal
> >>that your model is at the boundary of the parameter space, so you may
> >>have an underestimate and/or a non-identifiable model. How can you be
> >>sure? By using the profile() function on each model. In your case,
> >>model g1 is identifiable, but g2 is not. I would choose g1.
> >
> >This is not correct. Both variance components are identifiable in both
> >models, as the profile plots indicate. The estimate of the lab variance
> >component in g2 just happens to be 0 (the peak of the likelihood profile is at 0).
> >
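
The profile plots referred to above can be reproduced like this (a minimal
sketch, again assuming metafor is loaded):

profile(g1, sigma2 = 1)   # profile likelihood of the lab variance in g1
profile(g2, sigma2 = 1)   # same for g2; the profile peaks at 0 but is not flat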
> >>On 13/11/2021 05:24, Luke Martinez wrote:
> >>> Hello Colleagues,
> >>>
> >>> I've fit two candidate models (g1 and g2).
> >>>
> >>> With g1, I get two non-zero variance components for 'lab' and 'study'.
> >>>
> >>> With g2, I get a zero variance component for 'lab' and a non-zero one
> >>> for 'study'.
> >>>
> >>> So, from the perspective of variance components, g1 seems like a better model.
> >
> >I don't know why you think a model is better if the estimates of its variance
> >components are non-zero.
> >
> >>> But when I compare the models using AICc, g2 seems like a better model.
> >
> >Personally, I would rather think about the meaning of these models and use the
> >one that makes more sense in the given context.
> >
> >>> I wonder, then, which criterion I should use to choose the model (AICc
> >>> or the variance components)?
> >>>
> >>> Thanks,
> >>> Luke
> >>> For reproducibility, I'm showing a somewhat similar situation with a
> >>> small section of my data below.
> >>>
> >>> m="
> >>>        lab study   yi         vi es_id
> >>>           1     1 1.04 0.48503768 1
> >>>           1     1 0.96 0.51076604 2
> >>>           1     2 1.71 0.05767389 3
> >>>           2     2 1.52 0.07539841 4
> >>>           1     3 1.91 0.31349510 5
> >>>           2     4 3.01 0.67910095 6
> >>>           2     4 3.62 0.50670360 7
> >>>           9     5 0.99 0.18297170 8
> >>>           9     5 0.43 0.37225851 9
> >>>           9     6 1.68 0.39072390 10
> >>>           9     6 1.25 0.02879550 11"
> >>> dd <- read.table(text=m,h=T)
> >>>
> >>> (g1=rma.mv(yi, vi, random = ~1|lab/study, data = dd))
> >>>                      estim    sqrt  nlvls  fixed     factor
> >>> sigma^2.1  0.1245  0.3529      3     no        lab
> >>> sigma^2.2  0.2535  0.5035      7     no  lab/study
> >>>
> >>> (g2=rma.mv(yi, vi, random = list(~1|lab, ~1|study), data = dd))
> >>>                       estim    sqrt  nlvls  fixed  factor
> >>> sigma^2.1  0.0000  0.0000      3     no     lab
> >>> sigma^2.2  0.5127  0.7160      6     no   study
> >>>
> >>> fitstats(g1,g2)[5,]
> >>>              g1          g2
> >>> AICc: 30.85992 29.73897
> >>> (for the full data, the AICc of g2 is more noticeably smaller than that of g1)


