[R-sig-ME] error when modeling beta-binomial distributed data using glmmTBM
Sabrina Gavini
@@brin@g50n @ending from gm@il@com
Wed Sep 5 17:49:57 CEST 2018
Dear Sir.
I am trying to fit a mixed effect model to asses for effects upon the rate
of germinated polen grains. I started with a binomial distribution with a
model structure like this:
glmer(cbind(NGG,NGNG) ~ RH3*Altitude + AbH + Date3 + (1 |
Receptor/Code/Plant) +
(1 | Mountain/Community), data=database,
family="binomial",
control = glmerControl(optimizer="bobyqa"))
Where NGG is the number of successes (germinated grains per stigma) can
vary from 0 to e.g. 55.
NGNG is the number of failures (non-germinated grains) 0 to e.g. 80.
Here is the output from summary() and Anova()
Random effects:
Groups Name Variance Std.Dev.
Plant:(Code:Receptor) (Intercept) 0.4713 0.6865
Code:Receptor (Intercept) 0.7417 0.8612
Receptor (Intercept) 0.8514 0.9227
Community:Mountain (Intercept) 0.0000 0.0000
Mountain (Intercept) 0.2003 0.4475
Number of obs: 8082, groups:
Plant:(Code:Receptor), 2184; Code:Receptor, 436; Receptor, 88;
Community:Mountain, 9; Mountain, 3
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.756624 0.408515 -1.852 0.064007 .
RH31 0.128719 0.031442 4.094 4.24e-05 ***
RH32 0.207303 0.059691 3.473 0.000515 ***
RH33 -0.011154 0.122811 -0.091 0.927635
Altitude1800 -0.142022 0.121090 -1.173 0.240851
Altitude2000 0.005628 0.172520 0.033 0.973974
AbH -0.006041 0.002738 -2.206 0.027363 *
Date3february1 -0.162379 0.301456 -0.539 0.590129
Date3february2 -0.032708 0.320283 -0.102 0.918660
Date3january1 -0.358065 0.301815 -1.186 0.235475
Date3january2 -0.367096 0.296306 -1.239 0.215379
Date3march1 0.938912 0.397963 2.359 0.018310 *
Date3march2 -0.497544 0.561841 -0.886 0.375855
RH31:Altitude1800 -0.092963 0.047143 -1.972 0.048616 *
RH32:Altitude1800 -0.238150 0.084216 -2.828 0.004686 **
RH33:Altitude1800 0.007507 0.184352 0.041 0.967519
RH31:Altitude2000 -0.264065 0.055623 -4.747 2.06e-06 ***
RH32:Altitude2000 -0.335297 0.133998 -2.502 0.012341 *
RH33:Altitude2000 0.173064 0.299377 0.578 0.563209
Analysis of Deviance Table (Type III Wald chisquare tests)
Response: cbind(NGG, NGNG)
Chisq Df Pr(>Chisq)
(Intercept) 3.4304 1 0.064007 .
RH3 23.0569 3 3.930e-05 ***
Altitude 1.6657 2 0.434800
AbH 4.8678 1 0.027363 *
Date3 23.6524 6 0.000605 ***
RH3:Altitude 30.8681 6 2.686e-05 ***
The issue is data seems to be over-dispersed, as indicated by this
function...
overdisp_fun <- function(model) {
vpars <- function(m) {
nrow(m)*(nrow(m)+1)/2
}
model.df <- sum(sapply(VarCorr(model), vpars)) + length(fixef(model))
rdf <- nrow(model.frame(model))-model.df
rp <- residuals(model, type = "pearson") # computes pearson residuals
Pearson.chisq <- sum(rp^2)
prat <- Pearson.chisq/rdf
pval <- pchisq(Pearson.chisq, df = rdf, lower.tail = FALSE)
c(chisq = Pearson.chisq, ratio = prat, rdf = rdf, p = pval)}
The output of this function upon my model was:
chisq = 1.334567e+04, ratio = 1.656201e+00, rdf = 8.058000e+03, p =
3.845911e-268
So I decided to try a beta-binomial in glmmTMB as follows (its important to
keep this hierarchical structure):
glmmTMB(cbind(NGG,NGNG) ~ RH3*Altitude + AbH + Date3 + (1 |
Receptor/Code/Plant) +
(1 | Mountain/Community), data=database,
family=betabinomial(link = "logit"),
na.action = na.omit)
When I run it this error is displayed:
Error in nlminb(start = par, objective = fn, gradient = gr, control =
control$optCtrl) : (converted from warning) NA/NaN function evaluation
Is there something wrong in the model writing? I already checked for
posible issues in (
http://rstudio-pubs-static.s3.amazonaws.com/263877_d811720e434d47fb8430b8f0bb7f7da4.html)
but did not find any solution yet. I Know that kind of error sometimes came
out as a "warning" but the model still runs... but this was not the case.
The reason why I am scared by the output of my first binomial model, and
for which I try to run a beta-binomial, is that, according to the Anova (),
there is a significant interaction between two factors (richness of
heterospecific pollen (factor of 4 levels) and elevation (three-level
factor)) ... but according to the plots and lsmeans ( ) that interaction
does not seem to be real ... and possibly it is a consequence of an
inappropriate model owed to over-dispersion.
Just a few clarifications of my system so that you understand why I do a
binomial ...
"what I want to evaluate is the proportion of germinated grains; since I
saw many models using the number of polen grains as "succeses" and
non-germinated as "failure" (hence a proportion of succeses from a total
amount of tries (no of grains)). I used a poisson for another model when I
wanted to test for the "total number of polen grains". Here is the "rate of
polen germinated", so poisson is not an option here. Always the response
variable (germinated grains) is equal or less to the total number of grains
(number of attempts). So a case where the response variable is 55,
denominator is equal (55, succes rate of "1") or exceeds the count (e.g.
55/80, 0.68 of succes rate)".
Thanks
Sabrina
[[alternative HTML version deleted]]
More information about the R-sig-mixed-models
mailing list