[R-meta] Accounting for uncertainty in a mixed-effects regression

Fri Feb 2 19:58:46 CET 2018

Cesar,

Can you send a reproducible example of the issue?

As far as there being more biomes in the world than there are in your data,
that is a really modeling issue (a question about the science), rather than
an issue with the predict function: how should you extrapolate from the
biomes for which you have experimental data to the biomes that are not
observed in the data?

James

On Fri, Feb 2, 2018 at 12:19 PM, Cesar Terrer Moreno <cesar.terrer at me.com>
wrote:

> Dear James,
>
> Thanks, this is very helpful.
>
> However, I can’t manage to make the predict function work. My entire model
> is a bit more complicated than in my original question, also including an
> interaction as moderator and Biome as you just suggested :
>
> meta.mv <- rma.mv(es, var, data=df, mods= ~ 1 + MAP + MAT*CO2dif + Biome))
>
> Now, when I use predict:
> # correspondence of variables between df (meta-analysis database) and s.df
> (global data of the predictors to predict the effect globally) is:
> # MAT == temperature, MAP == precipitation, Biome == Biome, CO2inc = 300
>
> pred.mv <- predict(meta.mv,  newmods = cbind(s.df$precipitation,
> s.df$temperature, CO2inc, s.df$temperature*CO2inc,s.df$Biome))
>
> Error in predict.rma(ECMmv, newmods = cbind(s.df$precipitation,
> s.df$temperature,  :
>   Dimensions of 'newmods' do not match dimensions of the model.
>
> I guess this error is because i) Biome is categorical, and ii) number of
> factor levels in df$Biome and s.df$Biome is different:
>
> > levels(df$Biome)
> [1] "Boreal_Forest"    "Cropland"         "Grassland"        "Shrubland"
>
> [5] "Temperate_Forest" "Tropical_Forest”
>
> > levels(s.df$Biome)
>  [1] ""                 "Boreal_Forest"    "Cropland"         "Grassland"
>
>  [5] "Mixed"            "Nodata"           "Other"            "Shrubland"
>
>  [9] "Temperate_Forest" "Tropical_Forest"
>
> This is because there are “biomes” in the world (s.df) that are not
> included in the dataset of experiments of the meta-analysis (s), although
> all Biomes in s are included in s.df with the same name.
>
> How can I make the predict function work? Thanks
> César
>
> On 2 Feb 2018, at 16:38, James Pustejovsky <jepusto at gmail.com> wrote:
>
> Cesar,
>
> I think you will need to include ecosystem as a predictor (i.e., in the
> fixed effect part of the model, not just in the random effects structure)
> in order to account for the greater uncertainty about tropical ecosystems.
> Without doing so, the predictions for the tropics will be generated based
> on ALL of the ecosystems in the model (controlling for precipitation and
> temperature), and so their uncertainty will be driven by the total sample
> size. (It would even be possible to exclude all of the data from tropical
> ecosystems, fit the model, and yet still generate predictions for the
> tropics.)
>
> So I would recommend putting it in, as something like:
> meta.mv <- rma.mv(es, var, data=df, random= ~1|Biome, mods= ~ 1 + MAP +
> MAT + Biome)
> (Or perhaps it's not necessary to put it in the random effects structure,
> if the variability in true effects appears to be unrelated to ecosystem.)
>
> James
>
>
> On Fri, Feb 2, 2018 at 5:48 AM, Cesar Terrer Moreno <cesar.terrer at me.com>
> wrote:
>
>> I have calculated an effect size along a dataset of experiments
>> distributed worldwide through a mixed-effects meta-regression. The effect
>> size in the dataset depends on climate (y ~ temperature + precipitation):
>>
>>         meta <- rma(es, var, data=df, mods= ~ 1 + precip + temp)
>>
>> As I have data for temperature and precipitation for virtually all points
>> on Earth, I have upscaled this effect and standard error (SE) globally,
>> that is, applying the model to predict the effect for all points on Earth,
>> thus creating a gridded map:
>>
>>         s <- stack(temperature, precipitation) # Raster stack of the maps
>> of temperature and precipitation
>>         s.df <- as.data.frame(s,xy=TRUE) # Transform raster stack to
>> data.frame with coordinates
>>
>>         # predict ES and SE for all points with temperature and
>> precipitation data
>>         pred <- predict(meta, newmods = cbind(s.df$precipitation,
>> s.df$temperature))
>> The resulting "upscaled" predictions of SE thus represent the
>> meta-analysis derived uncertainty. What this SE does not reflect is the
>> uncertainty derived from the fact that the number of experiments in the
>> dataset used to run the meta-regression is obviously limited, yet I
>> predicted the effect globally, even in areas not covered by experiments. In
>> particular, it is very obvious in the dataset that tropical ecosystems are
>> not well represented, with just a couple of experiments in the
>> meta-analysis. Therefore, the upscaled effect should be more uncertain in
>> tropical regions. In the current version, however, SE is not particularly
>> higher in the tropics than in other areas, because the model just depends
>> on climate, not ecosystem representativeness. I would like to add a further
>> level of uncertainty to SE based on the number of studies per type of
>> ecosystem in the dataset. The idea is that ecosystems that are poorly
>> sampled should have a higher SE than ecosystems with plenty of measurements
>> in the dataset.
>>
>> Ecosystem type does not appear to explain variation in the effect, reason
>> I haven’t included it as a moderator. This could just be the consequence of
>> the low sample size of some ecosystem types. Regardless, ecosystem type
>> should be forced in the model somehow to account for the uncertainty
>> associated with the low sample size of some types of ecosystem.
>>
>> I have tried this:
>>
>>         meta.mv <- rma.mv(es, var, data=df, random= ~1|Biome, mods= ~ 1
>> + MAP + MAT)
>>         pred.mv <- predict(meta.mv,
>>                       newmods = cbind(s.df$precipitation,
>> s.df$temperature),
>>                       random= ~1|s.df$Biome)
>>
>> But the predictions of SE in pred.mv still show rather low SE values in
>> poorly sampled areas.
>>
>> How can I force the model to predict larger errors in ecosystem types
>> with low sample size? Should I include ecosystem type as a random or fixed
>> effect? How should the “predict" function be called?
>>         [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> R-sig-meta-analysis mailing list
>> R-sig-meta-analysis at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
>
>
>
>

	[[alternative HTML version deleted]]