[R-sig-ME] [RE] Response time modeling: Warning message with non-normal RAs

Fri Oct 19 15:28:07 CEST 2018

Hello,

Just a personnal opinion considering the point of interaction
depending on the scale. I think this is more an issue of what model do
you believe in and its mathematical formulation, and how to handle
non-linearities, than a problem of link function (or data
transformation), especially if you consider an interaction « simply »
as a product of predictor variables term in a linear model.

That is, considering your RT variable for instance, and two predictors
for sake of simplicity, you simply cannot have simultaneously

RT = µ0 + a * factor1 + b * factor2

and

log RT = µ0' + a' * factor1 + b' * factor2

(unless a = b = 0, or both factor1 and factor2 are indicator
variables, that is associated to binary qualitative predictors)

so when trying to apply a linear model to the scale where the
relationship is not linear, some « spurious » interactions (defined as
product of factor1 and factor2) seem to appear because the model will
add these terms to handle the intrinsic non-linearity of the model
induced by the scale change.

That does not mean the answer is easy, but that using this argument to
avoid using non-linear link functions seems strange unless you have
strong arguments to affirm that the relationship is indeed linear in
the original scale. If you think the model is linear in the log scale,
then it is the model with the linear link that is doubtful, and the
associated (lack of) interaction « spurious ».

My guess would be, in your particular case, that a gamma model with
identity link and linear model on this scale simply does not hold,
because it violates the constraints on the gamma distribution
parameters, and that having random effects with a large variance may
be enough to have cases where the model, even quite simple, gives
irrealistic results. Especially with a Gaussian random effect, that
can have negative values...

On Thu, Oct 18, 2018 at 01:04:36PM +0000, Baud-Bovy Gabriel wrote:
« Dear Ulf,
« 
« Thank you for the reply. I think that the approach is very reasonable 
« but I am
« not fully comfortable with it for reasons that I detail below your comments.
« 
« On 10/17/2018 6:28 PM, Ulf Köther wrote:
« > Dear Gabriel,
« >
« > as no one of the real professionals stepped in yet, I will try to
« > provide some thoughts:
« >
« > (1) In my experience, Gamma GLMMs with an identity link are very buggy
« > all the time, so normally, when I am confronted with positive continuous
« > data, I mostly never consider them in the first place...why not use
« > another link function when using GLMMs, because that's exactly the goal
« > they were invented for: modeling data that does not belong to a normal
« > distribution by using an appropriate link function.
« One motivation for my post was to ask for insights about why Gamma GLMMS 
« with identity link
« are so difficult to fit even with a very simple intercept only model.  
« My guess is that it is related to the
« fact that the distribution of the subject's mean are not gaussian but I 
« would very much appreciate
« opinion of the experts.
« 
« > (2) I see why you don't want to transform the raw RT data and use an
« > ordinary LMM, but in my opinion, using a Gamma GLMM with a log-Link is a
« > valid option. Especially because you can insert the raw RT as a DV and
« > then model it on the scale of the linear predictor and after that, get
« > all parameter estimates and (which is much more valuable in my opinion)
« > the predicted values on the scale of the dependent variable without any
« > hard work, just use the invert link function (e.g., use the effects
« > package with lme4 and get the raw RT predictions). The interpretation if
« > such a model is much easier as you can directly get the predictions on
« > the scale of the DV.
« >
« > Is it totally reasonable with regards to power? That's another question
« > that a real RT specialist should answer...
« The log link solves indeed the fitting issue. However, my reason for not 
« doing it, which is
« discussed in Lo & Andrew 2015 paper, is the problem of "scale dependent" 
« interactions,
« which they discuss in their section on Transform or Not to Transform.
« https://www.frontiersin.org/articles/10.3389/fpsyg.2015.01171/full
« 
« The "existence" of an such an interaction and, its statistical 
« significance, depends on
« the scale.  If I am not mistaken, being able to transform back the 
« coefficients on the raw RTs scale is a
« separate issue.
« 
« >
« > (2) I know the article you mentioned and skimmed it again. I really
« > wonder why they do not mention a log-link function as a valid option and
« > just compare the LMM with the transformed DV against a Gamma GLMM with
« > the identity link or an inverse Gaussian with identity link. It seems to
« > my that the Gamma GLMM with a log link is never really discussed. But,
« > as they also use the inverse Gaussian with the following link function
« > in lme4:
« >
« > invfn <- function() {
« >      ## link
« >      linkfun <- function(y) -1000/y
« >      ## inverse link
« >      linkinv <- function(eta)  -1000/eta
« >      ## derivative of invlink wrt eta
« >      mu.eta <- function(eta) { 1000/(eta^2) }
« >      valideta <- function(eta) TRUE
« >      link <- "-1000/y"
« >      structure(list(linkfun = linkfun, linkinv = linkinv,
« >                     mu.eta = mu.eta, valideta = valideta,
« >                     name = link),
« >                class = "link-glm")
« > }
« >
« > Why not try this when you want to go along the line the authors suggest?
« > The parameter estimates in their analyses seem really comparable between
« > Gamma and inverse Gaussian GLMMs to me. Just have a look at the
« > supplement! ;-)
« 
« While I think that the problem of "scale dependent"  interaction 
« remains, I probably should explore more
« this possibility because the transformed scale might  justified on the 
« theoretical basis. In fact, I think
« that inverse transformation might be justified on basis of the LATER's 
« model or similar
« (https://www.sciencedirect.com/science/article/pii/S0149763415301226). 
« In this case,
« the coefficients represent the rate of accumulation of evidence. My 
« understanding
« of this model is that gaussian noise is added to the rate of 
« accumulation.  Typically, the inverse
« is applied to the RTs and the transformed data are analyzed with LMs or 
« LMMs.
« 
« Lo & Andrew use the inverse link function with Gamma or Inverse Gaussian 
« distribution but they don't
« really discuss the assumptions behind choosing the inverse link 
« function  and the inverse gaussian. From what I
« know, the inverse gaussian distribution is not the reciprocal of the 
« normal distribution and I see only a
« partial  connection between the LATER model and the proposed GLMM model.
« 
« I would appreciate any further comment from you and other RT experts on 
« this issue.
« 
« Best,
« 
« Gabriel
« 
« P.S.  Paul Johnson's wrote an interesting note on Gamma GLM with the 
« inverse link
« (i.e. when modeling relationships of the form mu = a + b/x or 1/mu = a + 
« b/x).
« http://pj.freefaculty.org/guides/stat/Regression-GLM/Gamma/GammaGLM-01.pdf
« His point was that using GLM rather than OLS  was important when there 
« is a transformation
« but using Normal  instead of Gamma  distributed dependent variable with 
« GLM does not
« result in disaster.  I think that the question when analyzing RTs is 
« different since there
« is reason to use Gamma without a transformation, a case that is not 
« discussed in Paul
« Johnson's note.
« 
« > Try to play around with your generated data with the log-link function
« > and with the inverse Gaussian to see which one recovers your parameters
« > best and go from there...!
« >
« > Greetings,
« >
« > Ulf
« >
« >
« >
« >
« > Am 17.10.2018 um 11:06 schrieb Baud-Bovy Gabriel:
« >> My apologies for sending this post again.  I am not sure that it went
« >> through because I did not
« >> receive it myself (though I think my options should have let me receive
« >> it).  Also, there was
« >> a line missing due to hurried cut-and-paste in the example below.
« >>
« >> Dear all,
« >>
« >> I have a dataset with response times of 200 participants measured 36
« >> times in different conditions.
« >> The RTs distribution is right skewed as it is generally the case for
« >> RTs. I would like to avoid transform them
« >> (see Lo & Andrew
« >> (2015)](https://www.frontiersin.org/articles/10.3389/fpsyg.2015.01171/full).
« >> However, I often encounter warning messages with this dataset when using
« >> a GLMM with Gamma
« >> distribution and identity link. The problem is not caused by an over
« >> parametrization as it arises
« >> even in the case in a intercept only model.
« >>
« >>   > fit <- glmer(choice.lat ~ 1 + (1|su), data=dat,
« >> family=Gamma(link="identity"))
« >> Warning messages:
« >> 1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
« >>     Model failed to converge with max|grad| = 0.220808 (tol = 0.001,
« >> component 1)
« >> 2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
« >>     Model is nearly unidentifiable: very large eigenvalue
« >>    - Rescale variables?
« >>
« >>   > fit
« >> Generalized linear mixed model fit by maximum likelihood (Laplace
« >>     Approximation) [glmerMod]
« >>    Family: Gamma  ( identity )
« >> Formula: choice.lat ~ 1 + (1 | su)
« >>      Data: dat
« >>         AIC       BIC    logLik  deviance  df.resid
« >>    22629.26  22649.78 -11311.63  22623.26      6914
« >> Random effects:
« >>    Groups   Name        Std.Dev.
« >>    su       (Intercept) 0.8477
« >>    Residual             0.5721
« >> Number of obs: 6917, groups:  su, 200
« >> Fixed Effects:
« >> (Intercept)
« >>         3.044
« >> convergence code 0; 2 optimizer warnings; 0 lme4 warnings
« >>
« >> The dispersion parameter of the estimated Gamma distribution is
« >> sigma(fit.glmer)^2 = 0.327
« >> with a population mean (Fixed effect) of circa  3.0 sec which fits
« >> reasonably well the RTs.
« >>
« >> After some poking around, I figured out that the problem probably comes
« >> from the
« >> fact that the participants means are NOT distributed normally.  In fact,
« >> their distribution
« >> is similarly skewed. Removing the two extreme subjects or using a log
« >> link function "solves" the convergence
« >> problem/ warning messages but I don't like either solution because the
« >> is no valid reason to exclude these subjects
« >> and using a log transform cause theoretical problems when one want to
« >> interpret
« >> effects of more complex models (see Lo & Andrew (2015)).
« >>
« >> I have seen many discussions of about warning message/convergence problem (
« >> e.g.
« >> https://rstudio-pubs-static.s3.amazonaws.com/33653_57fc7b8e5d484c909b615d8633c01d51.html).
« >> The usual advice is to try with different optimizers and/or simplify the
« >> model, which do not help here (they all give
« >> similar warning messages; see below).  I have two questions:
« >>
« >> 1) Can somebody explain intuitively why non-normal random effects lead
« >> to these warning
« >>      messages ?  I would have naively assumed that it would simply lead
« >> to large a large variance estimate.
« >>
« >> 2) Is there something within the lme4 framework  to tackle this problem?
« >> Should I simply ignore the
« >>       warning messages ?
« >>
« >> 3) What principle approach would you suggest ? I have  seen that brms
« >> allow in principle to specify
« >>      non-normal random effect (I think that only student  is implemented
« >> so far) but I haven't tried yet.
« >>
« >> Here is a toy problem that reproduces the issue (it depends on the seed
« >> actually). Depending
« >> on the choice of the dispersion parameter for the REs, it is possible
« >> also to induce convergence
« >> problem.
« >>
« >> # artificial data set
« >> nsu <- 200
« >> nrep <- 36
« >> mu <- 3.0      # mean RT
« >> phi.su <- 0.8  # dispersion participants
« >> phi <- 0.3     # dispersion
« >> d <- expand.grid(
« >>     su = paste0("S",1:nsu),
« >>     rep = 1:nrep,
« >>     mu =NA,
« >>     rt = NA)
« >>
« >> set.seed(125)
« >> # generate 200 Gamma distributed  participant means
« >> mu.su <- sort(rGamma(nsu, shape=1/phi.su, scale=phi.su*mu))
« >> d$mu <- mu.su[d$su]
« >> # add Gamma noise
« >> d$rt <- rGamma(nsu*nrep,shape=1/phi,scale=phi*d$mu)
« >>
« >> # fit intercept GLMM model (gives warning messages)
« >> fit <- glmer(rt ~  1 + (1|su), data=d, family=Gamma(link="identity"))
« >>
« >> # histogram of participant means and response times
« >> par(mfrow=c(1,2))
« >> hist(mu.su,n=50)
« >> hist(d$rt,n=200,xlim=c(0,10))
« >>
« >>   > all_fit(fit)
« >> bobyqa. : [OK]
« >> Nelder_Mead. : [OK]
« >> optimx.nlminb : [OK]
« >> optimx.L-BFGS-B : [OK]
« >> nloptwrap.NLOPT_LN_NELDERMEAD : [OK]
« >> nloptwrap.NLOPT_LN_BOBYQA : [OK]
« >> nmkbw. : [OK]
« >> ....
« >> were 13 warnings (use warnings() to see them)
« >> 1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,
« >> ... :
« >>     Model failed to converge with max|grad| = 0.215844 (tol = 0.001,
« >> component 1)
« >> 2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,
« >> ... :
« >>     Model is nearly unidentifiable: very large eigenvalue
« >>    - Rescale variables?
« >> 3: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,
« >> ... :
« >>     Model failed to converge with max|grad| = 0.220808 (tol = 0.001,
« >> component 1)
« >> 4: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,
« >> ... :
« >>     Model is nearly unidentifiable: very large eigenvalue
« >>    - Rescale variables?
« >> 5: In optwrap(optimizer, devfun, start, rho$lower, control = control,  ... :
« >>     convergence code 1 from optimx
« >> 6: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,
« >> ... :
« >>     Model failed to converge with max|grad| = 0.215713 (tol = 0.001,
« >> component 1)
« >> 7: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,
« >> ... :
« >>     Model is nearly unidentifiable: very large eigenvalue
« >>    - Rescale variables?
« >> 8: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,
« >> ... :
« >>     Model failed to converge with max|grad| = 0.102562 (tol = 0.001,
« >> component 1)
« >> 9: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,
« >> ... :
« >>     Model failed to converge with max|grad| = 0.109779 (tol = 0.001,
« >> component 1)
« >> 10: In checkConv(attr(opt, "derivs"), opt$par, ctrl =
« >> control$checkConv,  ... :
« >>     Model failed to converge with max|grad| = 0.109779 (tol = 0.001,
« >> component 1)
« >> 11: In optwrap(optimizer, devfun, start, rho$lower, control = control,
« >> ... :
« >>     convergence code 2 from nmkbw
« >> 12: In checkConv(attr(opt, "derivs"), opt$par, ctrl =
« >> control$checkConv,  ... :
« >>     Model failed to converge with max|grad| = 0.220811 (tol = 0.001,
« >> component 1)
« >> 13: In checkConv(attr(opt, "derivs"), opt$par, ctrl =
« >> control$checkConv,  ... :
« >>     Model is nearly unidentifiable: very large eigenvalue
« >>    - Rescale variables?
« >>
« >>
« >> Thank you,
« >>
« >> Gabriel
« >>
« >> --
« >> ---------------------------------------------------------------------
« >> Gabriel Baud-Bovy               tel.: (+39) 348 172 4045     (mobile)
« >> UHSR University                       (+39) 02 2643 3429 (laboratory)
« >> via Olgettina, 58                     (+39) 02 9175 1540  (secretary)
« >> 20132 Milan, Italy              email:       gabriel.baud-bovy using hsr.it
« >> ---------------------------------------------------------------------
« >>
« >>
« >>
« >> Rispetta l’ambiente: non stampare questa mail se non è necessario.
« >> Respect the environment: print this email only if necessary.
« >> _______________________________________________
« >> R-sig-mixed-models using r-project.org mailing list
« >> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
« >>
« > --
« >
« > _____________________________________________________________________
« >
« > Universitätsklinikum Hamburg-Eppendorf; Körperschaft des öffentlichen Rechts; Gerichtsstand: Hamburg | www.uke.de
« > Vorstandsmitglieder: Prof. Dr. Burkhard Göke (Vorsitzender), Prof. Dr. Dr. Uwe Koch-Gromus, Joachim Prölß, Martina Saurin (komm.)
« > _____________________________________________________________________
« >
« > SAVE PAPER - THINK BEFORE PRINTING
« 
« 
« -- 
« ---------------------------------------------------------------------
« Gabriel Baud-Bovy               tel.: (+39) 348 172 4045     (mobile)
« UHSR University                       (+39) 02 2643 3429 (laboratory)
« via Olgettina, 58                     (+39) 02 9175 1540  (secretary)
« 20132 Milan, Italy              email:       gabriel.baud-bovy using hsr.it
« ---------------------------------------------------------------------
« 
« _______________________________________________
« R-sig-mixed-models using r-project.org mailing list
« https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

-- 
                                Emmanuel CURIS
                                emmanuel.curis using parisdescartes.fr

Page WWW: http://emmanuel.curis.online.fr/index.html