[R-meta] Accounting for uncertainty in a mixed-effects regression

Tue Feb 6 16:46:23 CET 2018

I noticed predict.rma works fine when I exclude the interaction from both the model and newmods. I have thus the feeling there might be something wrong with “CO2dif”, which I have forced to always have a value of 300, instead of varying as the other moderators, because I want a static picture at CO2dif=300. Therefore, CO2dif is not variable.  

I have reformulated my question to reflect this problem.

This is my model

### MODEL ###
summary(ECMmv <- rma(es, var, data=df.ecm, mods= ~ MAP + MAT*CO2dif + Biome))

And the newmods matrix, built from a dataset (s.df.ecm) that contains data for all predictors on a per pixel basis, with CO2dif as a constant 300.

### NEWMODS ###
s.df.ecm <- s.df %>% dplyr::select(x, y, MAT = temperature, MAP = precipitation, Biome) %>% 
  mutate(CO2dif = 300)
ECM.matrix <- model.matrix(~ MAP + MAT*CO2dif + Biome, data=s.df.ecm)

# Example of ECM.matrix:

 ECM.matrix[47881:47885,-1]
           MAP MAT CO2dif BiomeBoreal_Forest BiomeTropical_Forest MAT:CO2dif
47881 45.13125 2.2    300                  0                    0        660
47882 47.75000 2.2    300                  0                    0        660
47883 51.20000 2.2    300                  0                    0        660
47884 55.93125 2.2    300                  0                    0        660
47885 61.94375 2.2    300                  0                    0        660

### PREDICT ###
ECMpred.mv <- predict(ECMmv, newmods = ECM.matrix[,-1])
Error in predict.rma(ECMmv, newmods = ECM.matrix[, -1]) : 
 Multiple matches for the same variable name.

This doesn’t work. How can I use predict with a model with an interaction in which one of the terms is fixed at a certain value?
Thanks
> On 6 Feb 2018, at 12:14, Cesar Terrer Moreno <cesar.terrer at me.com> wrote:
> 
> Dear Wolfgang, 
> 
> I seem to be close to running your suggested analysis, but I get an error in predict.rma:
> 
> ### MODEL ###
> summary(ECMmv <- rma(es, var, data=df.ecm, mods= ~ MAP + MAT*CO2dif + Biome))
>> colnames(ECMmv$X)[-1]
> [1] "MAP"                  "MAT"                  "CO2dif"              
> [4] "BiomeBoreal_Forest"   "BiomeTropical_Forest" "MAT:CO2dif” 
> 
> ### GLOBAL DATASET ###
> ECM.matrix <- model.matrix(~ MAP + MAT*CO2dif + Biome, data=s.df.ecm)
>> colnames(ECM.matrix)[-1]
> [1] "MAP"                  "MAT"                  "CO2dif"              
> [4] "BiomeBoreal_Forest"   "BiomeTropical_Forest" "MAT:CO2dif”
> 
> ### PREDICT ###
> ECMpred.mv <- predict(ECMmv, newmods = ECM.matrix[,-1])
> Error in predict.rma(ECMmv, newmods = ECM.matrix[, -1]) : 
>  Multiple matches for the same variable name.
> 
> As you can see, the names of the columns match in the model and newmods, so I am not really sure what this error means.
> 
> Many thanks for your help.
> Best wishes,
> Cesar
> 
>> On 5 Feb 2018, at 15:22, Viechtbauer Wolfgang (SP) <wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:
>> 
>> Careful with using rma.mv(). It requires that you specify all random effects, including the one that is automatically added in RE models fitted with the rma() function. See:
>> 
>> http://www.metafor-project.org/doku.php/tips:rma.uni_vs_rma.mv
>> 
>> So right now, the model below is not a random-effects model.
>> 
>> Leaving this aside, if you have a dataset that includes the same predictor variables (with the same names) as the model that you fitted, then you should be able to just do this:
>> 
>> model.matrix(~ 1 + MAP + MAT*CO2dif + Biome, data=s.df)
>> 
>> to construct the matrix you need to give to 'newmods' of predict(). If the variable names differ, the safest approach is to just rename the variables in 's.df' so they match exactly the names of the predictors you used in rma.mv().
>> 
>> Best,
>> Wolfgang
>> 
>>> -----Original Message-----
>>> From: Cesar Terrer Moreno [mailto:cesar.terrer at me.com]
>>> Sent: Monday, 05 February, 2018 13:26
>>> To: Viechtbauer Wolfgang (SP)
>>> Cc: r-sig-meta-analysis at r-project.org
>>> Subject: Re: [R-meta] Accounting for uncertainty in a mixed-effects
>>> regression
>>> 
>>> Hi Wolfgang,
>>> 
>>> Thanks for your response.
>>> 
>>> Yes, including “Biome” as a moderator seems like a reasonable approach to
>>> force the model to predict larger SE values when the sample size in a
>>> certain biome is low.
>>> 
>>> My model would thus be:
>>> 
>>> ECMmv <- rma.mv(es, var, data=ecm.df, mods= ~ 1 + MAP + MAT*CO2dif +
>>> Biome)
>>> 
>>> Biome has six levels:
>>>> levels(s.df$Biome)
>>> [1] "Grassland"        "Boreal_Forest"    "Cropland"         "Shrubland"
>>> [5] "Temperate_Forest" "Tropical_Forest"
>>> 
>>> Now I don’t know how to use the predict function using a dataset that
>>> includes data for all predictors on a per pixel basis:
>>> 
>>>> head(s.df)
>>>       x      y temperature precipitation     Biome CNr
>>> 1 -179.875 89.875          NA            NA Grassland  NA
>>> 2 -179.625 89.875          NA            NA Grassland  NA
>>> 3 -179.375 89.875          NA            NA Grassland  NA
>>> 4 -179.125 89.875          NA            NA Grassland  NA
>>> 5 -178.875 89.875          NA            NA Grassland  NA
>>> 6 -178.625 89.875          NA            NA Grassland  NA
>>> 
>>> CO2inc <- 300 # ppm
>>> 
>>> ECMpred.mv <- predict(ECMmv,
>>>                    newmods = cbind(s.df$precipitation,
>>> s.df$temperature, CO2inc, s.df$temperature*CO2inc,s.df$Biome))
>>> 
>>> How would you configure “predict” to predict ES and SE per row as a
>>> function of Biome type?
>>> Thanks a lot
>>> César
>>> 
>>>> On 4 Feb 2018, at 13:57, Viechtbauer Wolfgang (SP)
>>> <wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:
>>>> 
>>>> Yes, I just wanted to tie up those loose ends before getting to the
>>> more recent discussion.
>>>> 
>>>> As James indicated, you cannot predict for a level that was not
>>> included in the original dataset. Also, factors are turned into a bunch
>>> of dummy variables, so if you want to use predict(), you have to specify
>>> the values of the dummy variables instead of including a factor/character
>>> variable in 'newmods'. See:
>>>> 
>>>> http://www.metafor-project.org/doku.php/tips:testing_factors_lincoms
>>>> 
>>>> As for the actual problem:
>>>> 
>>>> In levels of 'Biome' with a small number of studies, the SE should tend
>>> to be larger (all else equal, a larger k for a level will lead to a
>>> decrease in the SE for the level). So in that sense, the model should
>>> reflect the larger uncertainty in levels of 'Biome' with little
>>> information.
>>>> 
>>>> For example:
>>>> 
>>>> library(metafor)
>>>> dat <- get(data(dat.bcg))
>>>> dat <- escalc(measure="RR", ai=tpos, bi=tneg, ci=cpos, di=cneg,
>>> data=dat.bcg)
>>>> res <- rma(yi, vi, mods = ~ alloc - 1, data=dat)
>>>> res
>>>> plot(c(table(dat$alloc)), res$se, pch=19)
>>>> 
>>>> So, before thinking about other methods, how does that pan out in your
>>> data?
>>>> 
>>>> Best,
>>>> Wolfgang
>