[R-sig-ME] Problems fitting GLMM and getting AIC

Mollie Brooks mollieebrooks at gmail.com
Tue Nov 21 13:23:57 CET 2017


Hi Mario,

In general, I believe it’s ok to simultaneously select (using information criteria) both the formula and distribution because the amount of variance and its shape around the mean depends on what predictors are in the formula. However, rounding the response variable to be integers is almost always a bad idea so I would stop considering the Poisson and negative binomial unless there is some reason why qPCR should really be an integer. Is this what the reviewers asked you to do? I would first check what distributions other people use for qPCR. 

You may want to model the zeros separately using a hurdle model with predictors (~ day * trtmnt + (1 | exp.ID)) on the zero vs non-zeros. This could be done in two steps with glmer: 
(1) do logistic regression to model zero vs non-zero qPCR (family=binomial) 
and then (2) for the subset of the data that has a non-zero qPCR, model it with family=Gamma, or the log of this subset of the qPCR could be Gaussian. It’s ok to select between the Gamma and Lognormal for this subset. Do not add 1 to the qPCR.

Based on the snippets of code you sent, it looks like you don’t have your data in a data.frame. Putting the data in a data frame will make the analyses easier and it will be easier to get the right subset for the positive part of the hurdle model. 

Assuming your data frame is called dat…
z1 = glmer((qpcr.bart > 0) ~ day * trtmnt + ( 1 | exp.ID), family=binomial, dat)

c1= glmer(qpcr.bart ~ day * trtmnt + ( 1 | exp.ID), family=Gamma, subset(dat, qpcr.bart > 0))
c2= lmer(log(qpcr.bart) ~ day * trtmnt + ( 1 | exp.ID), subset(dat, qpcr.bart > 0))
Then compare the AICc values of c1 and c2.

cheers,
Mollie

> On 21Nov 2017, at 11:56, Thierry Onkelinx <thierry.onkelinx at inbo.be> wrote:
> 
> Dear Mario,
> 
> A reproducible example makes it much easier to help you. See
> http://reprex.tidyverse.org/articles/reprex.html and
> https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
> 
> I'm not sure if you can compare AICc when using different
> distributions. Rather choose the relevant distribution based on the
> dataset.
> 
> What is the relevant of the measurements on day 0? Are they 0 by
> definition? In that case I would omit then from the dataset.
> 
> Q4: see http://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#model-definition
> 
> For your other questions: please post a reproducible example.
> 
> Best regards,
> 
> ir. Thierry Onkelinx
> Statisticus / Statistician
> 
> Vlaamse Overheid / Government of Flanders
> INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE
> AND FOREST
> Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
> thierry.onkelinx at inbo.be
> Kliniekstraat 25, B-1070 Brussel
> www.inbo.be
> 
> ///////////////////////////////////////////////////////////////////////////////////////////
> To call in the statistician after the experiment is done may be no
> more than asking him to perform a post-mortem examination: he may be
> able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
> The plural of anecdote is not data. ~ Roger Brinner
> The combination of some data and an aching desire for an answer does
> not ensure that a reasonable answer can be extracted from a given body
> of data. ~ John Tukey
> ///////////////////////////////////////////////////////////////////////////////////////////
> 
> 
> Van 14 tot en met 19 december 2017 verhuizen we uit onze vestiging in
> Brussel naar het Herman Teirlinckgebouw op de site Thurn & Taxis.
> Vanaf dan ben je welkom op het nieuwe adres: Havenlaan 88 bus 73, 1000 Brussel.
> 
> ///////////////////////////////////////////////////////////////////////////////////////////
> 
> 
> 
> 2017-11-15 16:50 GMT+01:00 Mario Garrido <gaiarrido at gmail.com>:
>> Dear lme4-users,
>> I am trying to fit my data to a generalized linear model with repeated
>> measures. My data consists on the relative amount of bacterial colonies in
>> hosts' blood provided by qPCR (qpcr.bart), I measure the amount of qpcr.bart at
>> 7 different time points (day) for each of the 19 individuals (exp.ID).
>> As random
>> factor I use (1 | exp.ID). At time zero all the values are 0 cause the
>> indivuals were still not infected.
>> As a fixed factor, I also include treatment (trtmnt): whether animals were
>> infected with 1 bacteria (Bartonella) or 2 (coinfected with another)
>> 
>> Attach is my data. I want to fit my data to different model distributions
>> (even if I know in advance that are not the correct distributionss) then
>> compare between them using AICc, as some reviewers asked me
>> 
>> #I have no problem to fit the model to *Gaussian*. even if I know that my
>> distribution is not normal, as analyses of the residuals show.
>> lmer(qpcr.bart ~ day * trtmnt + (1 | exp.ID)) ->mgauss
>> 
>> #I get an error when trying to fit to *Gamma*, even if I add 1 to
>> qpcr.bart cause
>> Gamma not accept zeroes
>> glmer((qpcr.bart+1) ~ day * trtmnt + ( 1 | exp.ID), family = Gamma) ->mGamma
>> Error in (function (fr, X, reTrms, family, nAGQ = 1L, verbose = 0L, maxit =
>> 100L,  :
>>  (maxstephalfit) PIRLS step-halvings failed to reduce deviance in
>> pwrssUpdate
>> 
>> *Q1:* is this error due to the low variability in the data and the use of
>> too many parameters?
>> 
>> #By round the numbers and avoid integers (dezimals) I can consider data as
>> count data and fit to *Poisson*. I also have the following problems and, in
>> addition, I do not know whether I can extract AIC or any other I-T index
>> glmer(round(qpcr.bart,digits=0) ~ day * trtmnt + ( 1 | exp.ID), family =
>> poisson) ->mPoisson
>> Warning messages:
>> 1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>>  Model failed to converge with max|grad| = 0.00263583 (tol = 0.001,
>> component 1)
>> 2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>>  Model is nearly unidentifiable: very large eigenvalue
>> - Rescale variables?;Model is nearly unidentifiable: large eigenvalue ratio
>> - Rescale variables?
>> 
>> *Q2:* What this error meaning? I know I have overdispersion, but how to
>> deal with it? is this problem related with overdispersion?
>> 
>> #Lastly, if I accept that I have overdispersion I can go to Quasi approac
>> (but do not give me AIC) or try to converge to *negative binomial *using
>> glmer.nb from MASS.
>> glmer.nb(qpcr.bart ~ day * trtmnt + ( 1 | exp.ID))
>> Also I have 50 or more warnings
>> Mainly these ones
>> 1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,
>> ... :
>>  Model failed to converge with max|grad| = 0.00263583 (tol = 0.001,
>> component 1)
>> 2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,
>> ... :
>>  Model is nearly unidentifiable: very large eigenvalue
>> - Rescale variables?;Model is nearly unidentifiable: large eigenvalue ratio
>> - Rescale variables?
>> 3: In theta.ml(Y, mu, weights = object at resp$weights, limit = limit,  ... :
>>  iteration limit reached
>> 4: In sqrt(1/i) : NaNs produced
>> 
>> *Q3:* At this point I am totally lost, what is the problem?
>> 
>> *Q4:* Another question regarding the random factor: What is the meaning of
>> using (day | exp.ID) instead of (1 | exp.ID) as a random factor?
>> 
>> Thanks, here is my data
>> 
>> 
>> *exp ID* *BM* *day* *qpcr bart* *trtmnt *
>> EA115 35.15 0 0 coinfection
>> EA115 33.51 11 24100 coinfection
>> EA115 35.01 21 10700 coinfection
>> EA115 35.34 31 1580 coinfection
>> EA115 30.89 42 228 coinfection
>> EA115 32.98 52 8.75 coinfection
>> EA115 33.56 63 23.9 coinfection
>> EA100 40.18 0 0 coinfection
>> EA100 41.64 11 1710000 coinfection
>> EA100 42.23 21 2830 coinfection
>> EA100 42.4 31 245 coinfection
>> EA100 43.03 42 5.81 coinfection
>> EA100 43.35 52 0 coinfection
>> EA100 44.32 63 16 coinfection
>> EA109 41.12 0 0 coinfection
>> EA109 39.98 11 1440000 coinfection
>> EA109 39.9 21 6710 coinfection
>> EA109 41.23 31 28.9 coinfection
>> EA109 41.58 42 10.8 coinfection
>> EA109 42.1 52 0 coinfection
>> EA109 42.38 63 14.1 coinfection
>> EA91 33.31 0 0 coinfection
>> EA91 33.74 11 752000 coinfection
>> EA91 32.68 21 3850 coinfection
>> EA91 34.21 31 1460 coinfection
>> EA91 33.29 42 28.6 coinfection
>> EA91 34.38 52 0 coinfection
>> EA91 34.05 63 10.6 coinfection
>> EA86 38.56 11 18600 coinfection
>> EA86 38.01 21 11300 coinfection
>> EA86 37.63 0 0 coinfection
>> EA86 38.5 31 74.4 coinfection
>> EA86 38.68 42 7.18 coinfection
>> EA86 38.93 52 0 coinfection
>> EA86 38.45 63 20 coinfection
>> EA90 51.94 0 0 coinfection
>> EA90 50.59 11 1000000 coinfection
>> EA90 50.19 21 4030 coinfection
>> EA90 51.34 31 134 coinfection
>> EA90 51.17 42 11.7 coinfection
>> EA90 52.54 52 0 coinfection
>> EA90 52.48 63 15 coinfection
>> EA102 36.54 0 0 coinfection
>> EA102 37.37 11 1270000 coinfection
>> EA102 37.49 21 12200 coinfection
>> EA102 38.34 31 2040 coinfection
>> EA102 36.51 42 35.2 coinfection
>> EA102 37.23 52 8.19 coinfection
>> EA102 37.57 63 0 coinfection
>> EA104 40.37 0 0 coinfection
>> EA104 38.74 11 1020000 coinfection
>> EA104 39.48 21 8910 coinfection
>> EA104 40.81 31 2530 coinfection
>> EA104 39.37 42 26.9 coinfection
>> EA104 40.78 52 14.4 coinfection
>> EA104 40.96 63 0 coinfection
>> EA116 33.31 0 0 coinfection
>> EA116 32.85 11 442000 coinfection
>> EA116 33.29 21 7840 coinfection
>> EA116 34.06 31 1890 coinfection
>> EA116 33.92 42 204 coinfection
>> EA116 34.39 52 14.5 coinfection
>> EA116 34.42 63 14.5 coinfection
>> EA118 40.06 0 0 coinfection
>> EA118 40.78 11 330000 coinfection
>> EA118 41.66 21 1790 coinfection
>> EA118 42.06 31 153 coinfection
>> EA118 42.17 42 7.09 coinfection
>> EA118 42.28 52 12.5 coinfection
>> EA118 43.07 63 13.7 coinfection
>> EA96 34.39 0 0 Bart
>> EA96 33.94 11 1930000 Bart
>> EA96 34.65 21 1500 Bart
>> EA96 35.66 31 16.6 Bart
>> EA96 32.8 42 0 Bart
>> EA96 32.22 52 13.3 Bart
>> EA96 33.01 63 11.8 Bart
>> EA92 45.53 0 0 Bart
>> EA92 47.55 11 0 Bart
>> EA92 47.81 21 13000 Bart
>> EA92 48.38 31 750 Bart
>> EA92 47.33 42 18.7 Bart
>> EA92 47.37 52 0 Bart
>> EA92 47.27 63 18.6 Bart
>> EA97 35.42 0 0 Bart
>> EA97 33.37 11 480000 Bart
>> EA97 33.02 21 4200 Bart
>> EA97 34.88 31 17.9 Bart
>> EA97 34.63 42 0 Bart
>> EA97 34.88 52 0 Bart
>> EA97 34.75 63 11.2 Bart
>> EA119 45.31 0 0 Bart
>> EA119 42.72 11 468000 Bart
>> EA119 43.31 21 2320 Bart
>> EA119 44.06 31 59.4 Bart
>> EA119 44.6 42 9.56 Bart
>> EA119 43.51 52 0 Bart
>> EA119 45.08 63 0 Bart
>> EA112 40.15 0 0 Bart
>> EA112 39.54 11 1650000 Bart
>> EA112 39.43 21 3190 Bart
>> EA112 40.72 31 15.7 Bart
>> EA112 38.94 42 0 Bart
>> EA112 38.8 52 0 Bart
>> EA112 38.42 63 19.1 Bart
>> EA93 30.54 0 0 Bart
>> EA93 29.32 11 892000 Bart
>> EA93 28.89 21 7940 Bart
>> EA93 28.77 31 36.7 Bart
>> EA93 27.92 42 14.8 Bart
>> EA93 28.71 52 0 Bart
>> EA93 28 63 14.9 Bart
>> EA114 46.01 0 0 Bart
>> EA114 41.42 11 0 Bart
>> EA114 41.09 21 4780 Bart
>> EA114 41.7 31 18.6 Bart
>> EA114 41.78 42 17.4 Bart
>> EA114 41.91 52 6.24 Bart
>> EA114 43.01 63 20.7 Bart
>> EA120 41.86 0 0 Bart
>> EA120 40.95 11 1110000 Bart
>> EA120 41.19 21 6450 Bart
>> EA120 41.96 31 8.2 Bart
>> EA120 41.66 42 0 Bart
>> EA120 41.86 52 10.3 Bart
>> EA120 41.15 63 14.5 Bart
>> EA105 32.4 0 0 Bart
>> EA105 34.35 11 1850000 Bart
>> EA105 34.53 21 13800 Bart
>> EA105 34.82 31 1290 Bart
>> EA105 33.8 42 64.4 Bart
>> EA105 35.13 52 14.7 Bart
>> EA105 34.44 63 17.7 Bart
>> EA82 36.86 0 0 Bart
>> EA82 36.84 11 698000 Bart
>> EA82 36.24 21 1470 Bart
>> EA82 37.11 31 10.4 Bart
>> EA82 36 42 0 Bart
>> EA82 36.3 52 5.23 Bart
>> EA82 36.53 63 15.1 Bart
>> 
>>        [[alternative HTML version deleted]]
>> 
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> 
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models



More information about the R-sig-mixed-models mailing list