[R-meta] Nonlinear meta-regression with factorial moderator

Tue Apr 3 23:00:16 CEST 2018

The dataset didn't make it through, but I got it from you via email.

Fitting non-linear models isn't trivial and it's important to plot the data/results:

with(am.df, plot(CNr, es, pch=19, col=
   ifelse(am.df$BiomeGrassland==1, "black",
   ifelse(am.df$BiomeCropland==1,  "red",
   ifelse(am.df$BiomeShrubland==1, "blue",
   ifelse(am.df$BiomeTemperate_Forest==1, "green",
   ifelse(am.df$BiomeTropical_Forest==1, "orange", NA)))))))
xs <- seq(3, 23, by=.1)
lines(xs, nlfun(xs, 0,0,0,0, res$par[1], res$par[2], res$par[3], res$par[4], res$par[5], res$par[6]), col="black")
lines(xs, nlfun(xs, 1,0,0,0, res$par[1], res$par[2], res$par[3], res$par[4], res$par[5], res$par[6]), col="red")
lines(xs, nlfun(xs, 0,1,0,0, res$par[1], res$par[2], res$par[3], res$par[4], res$par[5], res$par[6]), col="blue")
lines(xs, nlfun(xs, 0,0,1,0, res$par[1], res$par[2], res$par[3], res$par[4], res$par[5], res$par[6]), col="green")
lines(xs, nlfun(xs, 0,0,0,1, res$par[1], res$par[2], res$par[3], res$par[4], res$par[5], res$par[6]), col="orange")

That shows problems with the estimate of the BiomeShrubland intercept. Adjusting the starting values seems to fix this:

res <- optim(par=c(8,0,4,0,0,0.4,log(.01)), llfun, yi=am.df$es, vi=am.df$var, x=am.df$CNr,
             B2=am.df$BiomeCropland, B3=am.df$BiomeShrubland,
             B4=am.df$BiomeTemperate_Forest, B5=am.df$BiomeTropical_Forest,
             hessian=TRUE)

But in the end, this doesn't fix the problem with the hessian. And yes, I think you are overfitting. There are only 2 points for BiomeShrubland and 3 points for BiomeTemperate_Forest, so I think you are asking a bit much from these data.

Best,
Wolfgang

-----Original Message-----
From: Cesar Terrer Moreno [mailto:cesar.terrer at me.com] 
Sent: Sunday, 01 April, 2018 12:23
To: Viechtbauer Wolfgang (SP)
Cc: r-sig-meta-analysis at r-project.org
Subject: Re: [R-meta] Nonlinear meta-regression with factorial moderator

Hi Wolfgang,

Thanks for your response.

I have followed your code and adapted for a factor with 5 levels (data attached)

df <- read.csv("~/OneDrive/OneDrive - Universitat Autònoma de Barcelona/FACEreview/data/AGB_effects.csv",na.strings=c("",NA))
df <- df[complete.cases(df$id),]
df2 <- filter(df, Site.Name != "Nevada FACE",!is.na(CNr)) %>% 
  rename(es=yi, var=vi)
df2$var <- ifelse(is.na(df2$var),max(df2$var,na.rm=T), df2$var)

am.df <- filter(df2,Myc=="AM") %>% mutate(Biome = relevel(Biome, ref= "Grassland")) %>% droplevels()

# Biome has 5 levels. Then create 4 dummy variables (so level 1 is the reference level). Let's call these dummies B2, B3, B4, B5
library(dummies)
am.df <- dummy.data.frame("Biome",data=am.df, sep = "")

nlfun <- function(x, B2, B3, B4, B5, p1, p2, p3, p4, p5, p6)
  (p1 + B2*p2 + B3*p3 + B4*p4 + B5*p5) * exp(-p6*x)

# optimization function
llfun <- function(par, yi, vi, x, B2, B3, B4, B5, random=TRUE) {
  p1 <- par[1]
  p2 <- par[2]
  p3 <- par[3]
  p4 <- par[4]
  p5 <- par[5]
  p6 <- par[6]
  if (random) {
    tau2 <- exp(par[7])
  } else {
    tau2 <- 0
  }
  mu <- nlfun(x, B2, B3, B4, B5, p1, p2, p3, p4, p5, p6)
  -sum(dnorm(yi, mean=mu, sd=sqrt(vi + tau2), log=TRUE))
}

# optimize
res <- optim(par=c(8,0,0,0,0,0.4,log(.01)), llfun, yi=am.df$es, vi=am.df$var, x=am.df$CNr, 
            B2=am.df$BiomeCropland, B3=am.df$BiomeShrubland, 
            B4=am.df$BiomeTemperate_Forest, B5=am.df$BiomeTropical_Forest, 
             hessian=TRUE)
# back-transform log(tau2) to tau2
res$par[7] <- exp(res$par[7])
tau2_ME <- res$par[7]

# fit model with tau2=0
res0 <- optim(par=c(8,0,0,0,0,0.4,log(.01)), llfun, yi=am.df$es, vi=am.df$var, x=am.df$CNr, 
              B2=am.df$BiomeCropland, B3=am.df$BiomeShrubland, 
              B4=am.df$BiomeTemperate_Forest, B5=am.df$BiomeTropical_Forest,
              random=FALSE, hessian=TRUE)

# LRT of H0: tau2=0
x2 <- -2 * (res0$value - res$value)
x2 ### test statistic
pchisq(x2, df=1, lower.tail=FALSE) ### p-value

# compute standard errors
vb <- chol2inv(chol(res$hessian))

However, in this last step, I get the following error:
Error in chol.default(res$hessian) : 
  the leading minor of order 6 is not positive definite

Google says this may be due to overfitting. What can I do?

Thanks
César