[R-meta] Accounting for uncertainty in a mixed-effects regression

James Pustejovsky jepusto at gmail.com
Fri Feb 2 19:58:46 CET 2018


Can you send a reproducible example of the issue?

As far as there being more biomes in the world than there are in your data,
that is a really modeling issue (a question about the science), rather than
an issue with the predict function: how should you extrapolate from the
biomes for which you have experimental data to the biomes that are not
observed in the data?


On Fri, Feb 2, 2018 at 12:19 PM, Cesar Terrer Moreno <cesar.terrer at me.com>

> Dear James,
> Thanks, this is very helpful.
> However, I can’t manage to make the predict function work. My entire model
> is a bit more complicated than in my original question, also including an
> interaction as moderator and Biome as you just suggested :
> meta.mv <- rma.mv(es, var, data=df, mods= ~ 1 + MAP + MAT*CO2dif + Biome))
> Now, when I use predict:
> # correspondence of variables between df (meta-analysis database) and s.df
> (global data of the predictors to predict the effect globally) is:
> # MAT == temperature, MAP == precipitation, Biome == Biome, CO2inc = 300
> pred.mv <- predict(meta.mv,  newmods = cbind(s.df$precipitation,
> s.df$temperature, CO2inc, s.df$temperature*CO2inc,s.df$Biome))
> Error in predict.rma(ECMmv, newmods = cbind(s.df$precipitation,
> s.df$temperature,  :
>   Dimensions of 'newmods' do not match dimensions of the model.
> I guess this error is because i) Biome is categorical, and ii) number of
> factor levels in df$Biome and s.df$Biome is different:
> > levels(df$Biome)
> [1] "Boreal_Forest"    "Cropland"         "Grassland"        "Shrubland"
> [5] "Temperate_Forest" "Tropical_Forest”
> > levels(s.df$Biome)
>  [1] ""                 "Boreal_Forest"    "Cropland"         "Grassland"
>  [5] "Mixed"            "Nodata"           "Other"            "Shrubland"
>  [9] "Temperate_Forest" "Tropical_Forest"
> This is because there are “biomes” in the world (s.df) that are not
> included in the dataset of experiments of the meta-analysis (s), although
> all Biomes in s are included in s.df with the same name.
> How can I make the predict function work? Thanks
> César
> On 2 Feb 2018, at 16:38, James Pustejovsky <jepusto at gmail.com> wrote:
> Cesar,
> I think you will need to include ecosystem as a predictor (i.e., in the
> fixed effect part of the model, not just in the random effects structure)
> in order to account for the greater uncertainty about tropical ecosystems.
> Without doing so, the predictions for the tropics will be generated based
> on ALL of the ecosystems in the model (controlling for precipitation and
> temperature), and so their uncertainty will be driven by the total sample
> size. (It would even be possible to exclude all of the data from tropical
> ecosystems, fit the model, and yet still generate predictions for the
> tropics.)
> So I would recommend putting it in, as something like:
> meta.mv <- rma.mv(es, var, data=df, random= ~1|Biome, mods= ~ 1 + MAP +
> MAT + Biome)
> (Or perhaps it's not necessary to put it in the random effects structure,
> if the variability in true effects appears to be unrelated to ecosystem.)
> James
> On Fri, Feb 2, 2018 at 5:48 AM, Cesar Terrer Moreno <cesar.terrer at me.com>
> wrote:
>> I have calculated an effect size along a dataset of experiments
>> distributed worldwide through a mixed-effects meta-regression. The effect
>> size in the dataset depends on climate (y ~ temperature + precipitation):
>>         meta <- rma(es, var, data=df, mods= ~ 1 + precip + temp)
>> As I have data for temperature and precipitation for virtually all points
>> on Earth, I have upscaled this effect and standard error (SE) globally,
>> that is, applying the model to predict the effect for all points on Earth,
>> thus creating a gridded map:
>>         s <- stack(temperature, precipitation) # Raster stack of the maps
>> of temperature and precipitation
>>         s.df <- as.data.frame(s,xy=TRUE) # Transform raster stack to
>> data.frame with coordinates
>>         # predict ES and SE for all points with temperature and
>> precipitation data
>>         pred <- predict(meta, newmods = cbind(s.df$precipitation,
>> s.df$temperature))
>> The resulting "upscaled" predictions of SE thus represent the
>> meta-analysis derived uncertainty. What this SE does not reflect is the
>> uncertainty derived from the fact that the number of experiments in the
>> dataset used to run the meta-regression is obviously limited, yet I
>> predicted the effect globally, even in areas not covered by experiments. In
>> particular, it is very obvious in the dataset that tropical ecosystems are
>> not well represented, with just a couple of experiments in the
>> meta-analysis. Therefore, the upscaled effect should be more uncertain in
>> tropical regions. In the current version, however, SE is not particularly
>> higher in the tropics than in other areas, because the model just depends
>> on climate, not ecosystem representativeness. I would like to add a further
>> level of uncertainty to SE based on the number of studies per type of
>> ecosystem in the dataset. The idea is that ecosystems that are poorly
>> sampled should have a higher SE than ecosystems with plenty of measurements
>> in the dataset.
>> Ecosystem type does not appear to explain variation in the effect, reason
>> I haven’t included it as a moderator. This could just be the consequence of
>> the low sample size of some ecosystem types. Regardless, ecosystem type
>> should be forced in the model somehow to account for the uncertainty
>> associated with the low sample size of some types of ecosystem.
>> I have tried this:
>>         meta.mv <- rma.mv(es, var, data=df, random= ~1|Biome, mods= ~ 1
>> + MAP + MAT)
>>         pred.mv <- predict(meta.mv,
>>                       newmods = cbind(s.df$precipitation,
>> s.df$temperature),
>>                       random= ~1|s.df$Biome)
>> But the predictions of SE in pred.mv still show rather low SE values in
>> poorly sampled areas.
>> How can I force the model to predict larger errors in ecosystem types
>> with low sample size? Should I include ecosystem type as a random or fixed
>> effect? How should the “predict" function be called?
>>         [[alternative HTML version deleted]]
>> _______________________________________________
>> R-sig-meta-analysis mailing list
>> R-sig-meta-analysis at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis

	[[alternative HTML version deleted]]

More information about the R-sig-meta-analysis mailing list