[R-meta] metafor - specifying spatial random effects

Mon Aug 24 15:43:04 CEST 2020

Hi Grace,

The results for the two models are the same because in the 'spatial model' tau^2 is estimated to be 0 (and hence there is no heterogeneity corresponding to the spatial random effect) and in the second model the variance of the true effects for the different levels of 'Habitat3' is 0. So in essence, you would get the same results with:

rma.mv(yi=yi, V=vi, data=subby, method="REML")

Looking at the Q-test, I see that the test statistic is about half the size of the degrees of freedom. That's a bit unusual. The expected value of Q under the null hypothesis (that the true effects are homogeneous) is the df. Under homogeneity, that means that Q should fluctuate around df and sure, it should also sometimes be smaller than the df, but such a small Q-statistic is a bit unusual (a value this small or smaller should only occur with probability 1 - .9744). In essence, this indicates 'excessive homogeneity'. This could occur for various reasons:

1) By chance (there is, after all, still a 2.6% chance of seeing such a small Q-statistic and if somebody can win the lottery, then somebody can also get such a small Q, the latter actually being much more likely).

2) This could happen when the sampling variances are not computed correctly, such that they are way too large. Then any heterogeneity will just 'disappear' in the sampling variances.

3) The sampling errors are highly positively correlated, but treated as independent. Note that adding spatial random effects can account for spatial correlations in the underlying true effects, but not in the sampling errors. For the latter, one would have to construct a V matrix that properly reflects spatial correlations in the sampling errors.

This is in essence the same issue as when dealing with temporally autocorrelated effects. Autocorrelation in the sampling errors needs to be handled separately from adding autocorrelated random effects. See help(dat.fine1993) and help(dat.ishak2007) for examples of that and:

Musekiwa, A., Manda, S. O., Mwambi, H. G., & Chen, D. G. (2016). Meta-analysis of effect sizes reported at multiple time points using general linear mixed model. PLOS ONE, 11(10), e0164898.

Best,
Wolfgang

>-----Original Message-----
>From: Grace Pold [mailto:apold using umass.edu]
>Sent: Friday, 21 August, 2020 17:14
>To: Viechtbauer, Wolfgang (SP); r-sig-meta-analysis using r-project.org <r-sig-
>meta-analysis using r-project.org>
>Subject: RE: metafor - specifying spatial random effects
>
>Dear Wolfgang,
>
>Sorry – I was mistaken about the output as one of the models didn’t re-run.
>
>So when both the spatial and non-spatial models show spatially-
>autocorrelated residuals (and an identical model), this means that the model
>fitting hasn’t accounted for any spatial signal in the data?
>
>#First, the spatial random effects model
>(heh3<-rma.mv(yi=yi, V=vi, random = ~ Decimal.Longtitude + Decimal.Latitude
>| Habitat3,data=subby,method="REML",dist="gcd", struct="SPGAU"))
>
>Multivariate Meta-Analysis Model (k = 27; method: REML)
>
>Variance Components:
>
>outer factor: Habitat3             (nlvls = 6)
>inner term:   ~Decimal.Longti[...] (nlvls = 11)
>
>             estim    sqrt  fixed
>tau^2       0.0000  0.0000     no
>rho        20.7485             no
>
>Test for Heterogeneity:
>Q(df = 26) = 13.8933, p-val = 0.9744
>
>Model Results:
>
>estimate      se     zval    pval    ci.lb   ci.ub
> -0.0292  0.0954  -0.3061  0.7595  -0.2161  0.1577
>
>---
>Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
>> #non-spatial model
>> (heh<-rma.mv(yi=yi, V=vi, random = list(~ 1 | Habitat3),
>data=subby,method="REML"))
>
>Multivariate Meta-Analysis Model (k = 27; method: REML)
>
>Variance Components:
>
>            estim    sqrt  nlvls  fixed    factor
>sigma^2    0.0000  0.0000      6     no  Habitat3
>
>Test for Heterogeneity:
>Q(df = 26) = 13.8933, p-val = 0.9744
>
>Model Results:
>
>estimate      se     zval    pval    ci.lb   ci.ub
> -0.0292  0.0954  -0.3061  0.7595  -0.2161  0.1577
>
>---
>Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
>>
>> #calculate Moran's I using the ape package
>> require(ape)
>> site.dists <- as.matrix(dist(cbind(subby$Decimal.Longtitude,
>subby$Decimal.Latitude)))
>>         site.dists.inv <- 1/site.dists
>>         diag(site.dists.inv) <- 0
>>       site.dists.inv[is.infinite(site.dists.inv)] <- 0
>> Moran.I(resid(heh3), site.dists.inv)
>$observed
>[1] -0.325002
>
>$expected
>[1] -0.03846154
>
>$sd
>[1] 0.1025168
>
>$p.value
>[1] 0.005189014
>
>> Moran.I(resid(heh), site.dists.inv)
>$observed
>[1] -0.325002
>
>$expected
>[1] -0.03846154
>
>$sd
>[1] 0.1025168
>
>$p.value
>[1] 0.005189014
>
>Thank you,
>Grace
>
>From: Viechtbauer, Wolfgang (SP)
>Sent: Friday, August 21, 2020 2:01 AM
>To: Grace Pold; r-sig-meta-analysis using r-project.org <r-sig-meta-analysis using r-
>project.org>
>Subject: RE: metafor - specifying spatial random effects
>
>Dear Grace,
>
>Can you post the output of both models?
>
>Best,
>Wolfgang
>
>>-----Original Message-----
>>From: Grace Pold [mailto:apold using umass.edu]
>>Sent: Friday, 21 August, 2020 5:21
>>To: Viechtbauer, Wolfgang (SP); r-sig-meta-analysis using r-project.org <r-sig-
>>meta-analysis using r-project.org>
>>Subject: RE: metafor - specifying spatial random effects
>>
>>Dear Wolfgang,
>>
>>I have a follow-up question.
>>
>>When comparing the spatial and the non-spatial models, I have different
>>parameter estimates as expected. However, the residuals are identical for
>>the two models. Is it expected that there would be the same residuals for
>>the spatial and non-spatial models?
>>
>>Thank you for your assistance,
>>
>>Grace
>>
>>From: Viechtbauer, Wolfgang (SP)
>>Sent: Thursday, August 20, 2020 4:28 AM
>>To: Grace Pold; r-sig-meta-analysis using r-project.org <r-sig-meta-analysis using r-
>>project.org>
>>Subject: RE: metafor - specifying spatial random effects
>>
>>Dear Grace,
>>
>>you need to create a variable that is a constant, so:
>>
>>datasub$const <- 1
>>
>>and then you can use:
>>
>>random = ~ Longitude + Latitude | const
>>
>>The post you are referring to is outdated. At that time, the rma.mv()
>>function did not yet have spatial correlation structures. Now you can just
>>use the 'struct' argument to specify the spatial correlation structure and
>>the optimization is done for you.
>>
>>With respect to your second question: To me, the choice (whether to include
>>location as a fixed or a random effect) comes down to whether I am actually
>>interested in differences between the specific locations that make up the
>>spatial configuration of your data points (in which case I would use a
>fixed
>>effect) or I just want to account for possible dependency due to the
>spatial
>>configuration (in which case I would use a random effect).
>>
>>Best,
>>Wolfgang
>>
>>>-----Original Message-----
>>>From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-
>>project.org]
>>>On Behalf Of Grace Pold
>>>Sent: Thursday, 20 August, 2020 0:41
>>>To: r-sig-meta-analysis using r-project.org <r-sig-meta-analysis using r-project.org>
>>>Subject: [R-meta] metafor - specifying spatial random effects
>>>
>>>Hello,
>>>
>>>I have a dataset of effect sizes taken at different places in the northern
>>>hemisphere, sometimes in close proximity, and sometimes in distant
>>>locations.
>>>
>>>I have two questions. The first is specifically related to metafor and
>>model
>>>specification, and the second is more generally related to spatial random
>>>effects (which I don’t think I will get answered, but thought I would
>throw
>>>it in anyway).
>>>
>>>First, I couldn’t find any examples of how to specify the model to include
>>>spatial random effects outside of the function help text. So I was hoping
>>>someone could tell me if the following specification for a geographic
>>>distance model is correct:
>>>
>>>rma.mv(yi=yi, V=vi, mods=~moderator1, random = ~ Longitude+ Latitude|
>>>moderator1,data=datub,method="REML", dist="gcd", struct="SPEXP")
>>>
>>>when there is a moderator like habitat type in my analysis. And:
>>>
>>>rma.mv(yi=yi, V=vi, random = ~ Longitude+Latitude|StudyID,
>>>data=datasub,method="REML", dist="gcd", struct="SPEXP")
>>>
>>>when I do not include moderators (StudyID is just the unique datapoint ID
>>>and so each StudyID only has one effect size value associated with it). I
>>>chose gcd as the distance because the curvature of the earth matters in my
>>>data, and have no rationale for choosing SPEXP for structure.
>>>
>>>The help text says “Let d denote the distance between two points that
>share
>>>the same level of the outer variable (if all true effects are allowed to
>be
>>>spatially correlated, simply set outer to a constant)”, so intuitively I
>>>wanted to write the model as
>>>
>>>rma.mv(yi=yi, V=vi, random = ~ Longitude+Latitude|1,
>>>data=datasub,method="REML", dist="gcd", struct="SPEXP")
>>>
>>>(ie with a “1” as the “outer” variable rather than a unique datapoint ID)
>>>because there is no “distance between two points sharing the same level”
>of
>>>StudyID because those are unique. However, it gives me a “Error in
>>>eval(predvars, data, env) : object 'Latitude' not found” error unless I
>>>include one of the named variables.
>>>
>>>I also wanted to check that the “parameter optimization” mentioned in the
>>>blog post here [https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2017-
>>>November/000371.html] on a previous approach to including spatial
>>similarity
>>>matrices is accounted for in the most recent version of metafor. Or are
>>>there additional steps I would need to complete?
>>>
>>>Second, if you wanted to know whether to include geographic location as a
>>>categorical (ex. New York vs. Beijing) versus geographic distance random
>>>effect versus not at all, would you suggest extracting the residuals and
>>>checking to see which model has done the best job of removing spatial
>>>correlation, if any exists, using Moran’s I?
>>>
>>>Thank you so much for your time and input,
>>>
>>>Grace Pold
>>>Postdoctoral researcher
>>>Cal Poly NRES