[R-meta] Accounting for uncertainty in a mixed-effects regression
Viechtbauer Wolfgang (SP)
wolfgang.viechtbauer at maastrichtuniversity.nl
Sun Feb 4 13:57:45 CET 2018
Yes, I just wanted to tie up those loose ends before getting to the more recent discussion.
As James indicated, you cannot predict for a level that was not included in the original dataset. Also, factors are turned into a bunch of dummy variables, so if you want to use predict(), you have to specify the values of the dummy variables instead of including a factor/character variable in 'newmods'. See:
As for the actual problem:
In levels of 'Biome' with a small number of studies, the SE should tend to be larger (all else equal, a larger k for a level will lead to a decrease in the SE for the level). So in that sense, the model should reflect the larger uncertainty in levels of 'Biome' with little information.
dat <- get(data(dat.bcg))
dat <- escalc(measure="RR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg)
res <- rma(yi, vi, mods = ~ alloc - 1, data=dat)
plot(c(table(dat$alloc)), res$se, pch=19)
So, before thinking about other methods, how does that pan out in your data?
>From: Cesar Terrer Moreno [mailto:cesar.terrer at me.com]
>Sent: Sunday, 04 February, 2018 13:52
>To: Viechtbauer Wolfgang (SP)
>Cc: r-sig-meta-analysis at r-project.org
>Subject: Re: [R-meta] Accounting for uncertainty in a mixed-effects
>I re-formulated this question more specifically and focused after some
>people helped with some initial issues. Please, see this newer question
>labeled as: “Accounting for uncertainty in a mixed-effects regression”.
>In this question, I’ve been initially helped by James, and I have learnt
>that the variable I need as a driver of uncertainty needs to be included
>as moderator. However, an issue with the predict question using
>categorical moderators persists.
>On 2 Feb 2018, at 19:19, Cesar Terrer Moreno <cesar.terrer at me.com> wrote:
>Thanks, this is very helpful.
>However, I can’t manage to make the predict function work. My entire
>model is a bit more complicated than in my original question, also
>including an interaction as moderator and Biome as you just suggested :
>meta.mv <- rma.mv(es, var, data=df, mods= ~ 1 + MAP + MAT*CO2dif +
>Now, when I use predict:
># correspondence of variables between df (meta-analysis database) and
>s.df (global data of the predictors to predict the effect globally) is:
># MAT == temperature, MAP == precipitation, Biome == Biome, CO2inc = 300
>pred.mv <- predict(meta.mv, newmods = cbind(s.df$precipitation,
>s.df$temperature, CO2inc, s.df$temperature*CO2inc,s.df$Biome))
>Error in predict.rma(ECMmv, newmods = cbind(s.df$precipitation,
> Dimensions of 'newmods' do not match dimensions of the model.
>I guess this error is because i) Biome is categorical, and ii) number of
>factor levels in df$Biome and s.df$Biome is different:
>"Boreal_Forest" "Cropland" "Grassland" "Shrubland"
> "Temperate_Forest" "Tropical_Forest”
>"" "Boreal_Forest" "Cropland" "Grassland"
>"Mixed" "Nodata" "Other" "Shrubland"
>  "Temperate_Forest" "Tropical_Forest"
>This is because there are “biomes” in the world (s.df) that are not
>included in the dataset of experiments of the meta-analysis (s), although
>all Biomes in s are included in s.df with the same name.
>How can I make the predict function work? Thanks
>On 2 Feb 2018, at 16:38, James Pustejovsky <jepusto at gmail.com> wrote:
>I think you will need to include ecosystem as a predictor (i.e., in the
>fixed effect part of the model, not just in the random effects structure)
>in order to account for the greater uncertainty about tropical
>ecosystems. Without doing so, the predictions for the tropics will be
>generated based on ALL of the ecosystems in the model (controlling for
>precipitation and temperature), and so their uncertainty will be driven
>by the total sample size. (It would even be possible to exclude all of
>the data from tropical ecosystems, fit the model, and yet still generate
>predictions for the tropics.)
>So I would recommend putting it in, as something like:
>meta.mv <- rma.mv(es, var, data=df, random= ~1|Biome, mods= ~ 1 + MAP +
>MAT + Biome)
>(Or perhaps it's not necessary to put it in the random effects structure,
>if the variability in true effects appears to be unrelated to ecosystem.)
>On Fri, Feb 2, 2018 at 5:48 AM, Cesar Terrer Moreno <cesar.terrer at me.com>
>I have calculated an effect size along a dataset of experiments
>distributed worldwide through a mixed-effects meta-regression. The effect
>size in the dataset depends on climate (y ~ temperature + precipitation):
> meta <- rma(es, var, data=df, mods= ~ 1 + precip + temp)
>As I have data for temperature and precipitation for virtually all points
>on Earth, I have upscaled this effect and standard error (SE) globally,
>that is, applying the model to predict the effect for all points on
>Earth, thus creating a gridded map:
> s <- stack(temperature, precipitation) # Raster stack of the maps
>of temperature and precipitation
> s.df <- as.data.frame(s,xy=TRUE) # Transform raster stack to
>data.frame with coordinates
> # predict ES and SE for all points with temperature and
> pred <- predict(meta, newmods = cbind(s.df$precipitation,
>The resulting "upscaled" predictions of SE thus represent the meta-
>analysis derived uncertainty. What this SE does not reflect is the
>uncertainty derived from the fact that the number of experiments in the
>dataset used to run the meta-regression is obviously limited, yet I
>predicted the effect globally, even in areas not covered by experiments.
>In particular, it is very obvious in the dataset that tropical ecosystems
>are not well represented, with just a couple of experiments in the meta-
>analysis. Therefore, the upscaled effect should be more uncertain in
>tropical regions. In the current version, however, SE is not particularly
>higher in the tropics than in other areas, because the model just depends
>on climate, not ecosystem representativeness. I would like to add a
>further level of uncertainty to SE based on the number of studies per
>type of ecosystem in the dataset. The idea is that ecosystems that are
>poorly sampled should have a higher SE than ecosystems with plenty of
>measurements in the dataset.
>Ecosystem type does not appear to explain variation in the effect, reason
>I haven’t included it as a moderator. This could just be the consequence
>of the low sample size of some ecosystem types. Regardless, ecosystem
>type should be forced in the model somehow to account for the uncertainty
>associated with the low sample size of some types of ecosystem.
>I have tried this:
> meta.mv <- rma.mv(es, var, data=df, random= ~1|Biome, mods= ~ 1 +
>MAP + MAT)
> pred.mv <- predict(meta.mv,
> newmods = cbind(s.df$precipitation,
> random= ~1|s.df$Biome)
>But the predictions of SE in pred.mv still show rather low SE values in
>poorly sampled areas.
>How can I force the model to predict larger errors in ecosystem types
>with low sample size? Should I include ecosystem type as a random or
>fixed effect? How should the “predict" function be called?
More information about the R-sig-meta-analysis