[R] Nonlinear logistic regression fitting
murdoch@dunc@n @end|ng |rom gm@||@com
Wed Jul 29 22:30:34 CEST 2020
On 29/07/2020 4:16 p.m., Sebastien Bihorel wrote:
> Thanks Duncan,
> (Sorry for the repeated email)
> People working in my field are frequently (and rightly) accused of
> butchering statistical terminology. So I guess I'm guilty as charged 😄
> I will look into the suggested path. One question though in your
> expression of loglik, p is "a + b*x/(c+x)". Correct?
p should be the probability of observing 1, so I would guess it is what
you calculated below, i.e. exp(likelihood)/(1 + exp(likelihood)) (in
your notation, not mine).
> *From:* Duncan Murdoch <murdoch.duncan using gmail.com>
> *Sent:* Wednesday, July 29, 2020 16:04
> *To:* Sebastien Bihorel <Sebastien.Bihorel using cognigencorp.com>; J C Nash
> <profjcnash using gmail.com>; r-help using r-project.org <r-help using r-project.org>
> *Subject:* Re: [R] Nonlinear logistic regression fitting
> Just a quick note about jargon: you are using the word "likelihood" in
> a way that I (and maybe some others) find confusing. (In fact, I think
> you used it two different ways, but maybe I'm just confused.) I would
> say that likelihood is the probability of observing the entire data set,
> considered as a function of the parameters. You appear to be using it
> (at first) as the probability that a particular observation is equal to
> 1, and then as the argument to a logit function to give that probability.
> What you probably want to do is find the parameters that maximize the
> likelihood (in my sense). The usual practice is to maximize the log of
> the likelihood; it tends to be easier to work with. In your notation
> below, the log likelihood would be
> loglik <- sum( resp*log(p) + (1-resp)*log1p(-p) )
> When you have a linear logistic regression model, this simplifies a bit,
> and there are fast algorithms that are usually stable to optimize it.
> With a nonlinear model, you lose some of that, and I'd suggest directly
> optimizing it.
> Duncan Murdoch
> On 29/07/2020 8:56 a.m., Sebastien Bihorel via R-help wrote:
>> Thank your, Pr. Nash, for your perspective on the issue.
>> Here is an example of binary data/response (resp) that were simulated and re-estimated assuming a non linear effect of the predictor (x) on the likelihood of response. For re-estimation, I have used gnlm::bnlr for the logistic regression. The accuracy of the parameter estimates is so-so, probably due to the low number of
> data points (8*nx). For illustration, I have also include a glm call to
> an incorrect linear model of x.
>> nx <- 10
>> x <- c(
>> rep(0, 3*nx),
>> rep(c(10, 30, 100, 500, 1000), each = nx)
>> rnd <- runif(length(x))
>> a <- log(0.2/(1-0.2))
>> b <- log(0.7/(1-0.7)) - a
>> c <- 30
>> likelihood <- a + b*x/(c+x)
>> p <- exp(likelihood) / (1 + exp(likelihood))
>> resp <- ifelse(rnd <= p, 1, 0)
>> df <- data.frame(
>> x = x,
>> resp = resp,
>> nresp = 1- resp
>> # glm can only assume linear effect of x, which is the wrong model
>> glm_mod <- glm(
>> data = df,
>> family = 'binomial'
>> # Using gnlm package, estimate a model model with just intercept, and a model with predictor effect
>> int_mod <- gnlm::bnlr( y = df[,2:3], link = 'logit', mu = ~ p_a, pmu = c(a) )
>> emax_mod <- gnlm::bnlr( y = df[,2:3], link = 'logit', mu = ~ p_a + p_b*x/(p_c+x), pmu = c(a, b, c) )
>> From: J C Nash <profjcnash using gmail.com>
>> Sent: Tuesday, July 28, 2020 14:16
>> To: Sebastien Bihorel <Sebastien.Bihorel using cognigencorp.com>; r-help using r-project.org <r-help using r-project.org>
>> Subject: Re: [R] Nonlinear logistic regression fitting
>> There is a large literature on nonlinear logistic models and similar
>> curves. Some of it is referenced in my 2014 book Nonlinear Parameter
>> Optimization Using R Tools, which mentions nlxb(), now part of the
>> nlsr package. If useful, I could put the Bibtex refs for that somewhere.
>> nls() is now getting long in the tooth. It has a lot of flexibility and
>> great functionality, but it did very poorly on the Hobbs problem that
>> rather forced me to develop the codes that are 3/5ths of optim() and
>> also led to nlsr etc. The Hobbs problem dated from 1974, and with only
>> 12 data points still defeats a majority of nonlinear fit programs.
>> nls() poops out because it has no LM stabilization and a rather weak
>> forward difference derivative approximation. nlsr tries to generate
>> analytic derivatives, which often help when things are very badly scaled.
>> Another posting suggests an example problem i.e., some data and a
>> model, though you also need the loss function (e.g., Max likelihood,
>> weights, etc.). Do post some data and functions so we can provide more
>> focussed advice.
>> On 2020-07-28 10:13 a.m., Sebastien Bihorel via R-help wrote:
>>> I need to fit a logistic regression model using a saturable Michaelis-Menten function of my predictor x. The likelihood could be expressed as:
>>> L = intercept + emax * x / (EC50+x)
>>> Which I guess could be expressed as the following R model
>>> ~ emax*x/(ec50+x)
>>> As far as I know (please, correct me if I am wrong), fitting such a model is to not doable with glm, since the function is not linear.
>>> A Stackoverflow post recommends the bnlr function from the gnlm (https://stackoverflow.com/questions/45362548/nonlinear-logistic-regression-package-in-r)...
> I would be grateful for any opinion on this package or for any
> alternative recommendation of package/function.
>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>> [[alternative HTML version deleted]]
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
More information about the R-help