[R] Probit regression with limited parameter space

Ben Bolker bbolker at gmail.com
Thu Feb 2 20:12:13 CET 2012

  [cc'ing back to r-help]

On 12-02-02 01:56 PM, Sally Luo wrote:

> I tried to adapt your code to my model and got the results as below.  I
> don't know how to fix the warning messages. It says "rearrange the lower
> (or upper) bounds to match 'start'".

  The warning is overly conservative in this case.  I should work on
engineering the package so that it handles this better. You can
disregard them.

  In answer to your previous questions:

 * "size" refers to the number of trials per observation (1, if you have
binary data)
 * you've got the form of the lower and upper bounds right.
 * you've got the formula in 'parameters' right -- this builds a linear
model (using R's model.matrix) on the probit scale based on the 8 parameters
> And two of the estimates for my restricted parameters are on the boundary.
> The warning message says the variance-covariance calculations may be
> unreliable.  Those parameters are the ones of interest to my study.  Can I
> still make inferences using the p-values reported by mle2 in this case?

  That's quite tricky unfortunately, and it isn't a problem that's
specific to the mle2 package.  The basic issue is that the whole
derivation of the multivariate normal sampling distribution of the
maximum likelihood estimator depends on the maximum likelihood being an
interior local maximum (and hence having a negative-definite hessian, or
a positive-definite information matrix), which is untrue on the boundary
-- the Wikipedia article on maximum likelihood mentions this issue, for
example http://en.wikipedia.org/wiki/Maximum_likelihood

  Perhaps someone here can suggest an approach (although it gets outside
the scope of "R help", or you can ask on http://stats.stackexchange.com ...

> Thanks for your help.  ---- Sally
> mle2fit <- mle2(y~dbinom(pnorm(pprobit),size=1),
> +             parameters=list(pprobit~x1+x2+x3+x4+x5+x6+x7+x8),
> +             start=list(pprobit=0),
> +             optimizer="nlminb",
> +             lower=c(-Inf,-1,-1,-1,-Inf,-Inf,-Inf,-Inf,-Inf),
> +             upper=c(Inf,1,1,1,Inf,Inf,Inf,Inf,Inf),
> +             data=d)
> Warning messages:
> 1: In fix_order(call$lower, "lower bounds", -Inf) :
>   lower bounds not named: rearranging to match 'start'
> 2: In fix_order(call$upper, "upper bounds", Inf) :
>   upper bounds not named: rearranging to match 'start'
> 3: In mle2(y ~ dbinom(pnorm(pprobit), size = 1), parameters = list(pprobit
> ~  :
>   some parameters are on the boundary: variance-covariance calculations
> based on Hessian may be unreliable

> On Wed, Feb 1, 2012 at 11:16 PM, Sally Luo <shali623 at gmail.com> wrote:
>> Prof. Bolker,
>> Thanks a lot for your reply.
>> In my model, I have 9 explanatory variables and I need to restrict the
>> range of parameters 2-4 to (-1,1).  I tried to modify the univariate probit
>> example you gave in your reply, however, I could not get through.
>> Specificially, I am not sure what 'pprobit' represents in your code? How
>> should I code this part if I have more than one variable?
>> Also does "size" refer to the number of parameters?
>> Since only 3 parameters need to be restricted in my model, should I write
>> lower=c(-Inf, -1,-1,-1, -Inf, -Inf, -Inf, -Inf, -Inf) and upper=c(Inf,
>> 1,1,1, Inf, Inf, Inf, Inf, Inf)?
>> Thanks again for your kind help.
>> Best,
>> Sally
>> On Wed, Feb 1, 2012 at 7:19 AM, Ben Bolker <bbolker at gmail.com> wrote:
>>>  Sally Luo <shali623 <at> gmail.com> writes:
>>>> Dear R helpers,
>>>> I need to estimate a probit model with box constraints placed on
>>> several of
>>>> the model parameters.  I have the following two questions:
>>>> 1) How are the standard errors calclulated in glm
>>>> (family=binomial(link="probit")?  I ran a typical probit model using the
>>>> glm probit link and the nlminb function with my own coding of the
>>>> loglikehood, separately. As nlminb does not produce the hessian matrix,
>>> I
>>>> used hessian (numDeriv) to calculate it.  However, the standard errors
>>>> calculated using hessian function are quite different from the ones
>>>> generated by the glm function, although the parameter estimates are very
>>>> close.  I was wondering what makes this difference in the estmation of
>>>> standard errors and how this computation is carried out in glm.
>>>> 2) Does any one know how to estimate a constrained probit model in R
>>> (to be
>>>> specific, I need to restrain the range of three parameters to [-1,1])?
>>>> Among the optimation functions, so far nlminb and spg work for my
>>> problem,
>>>> but neither produces a hessian matrix.  As I mentioned above, if I use
>>>> hessian funciton and calculate standard errors manually, the standard
>>>> errors seem not right.
>>>   I'm a little biased, but I think the bbmle package is the
>>> easiest way to get this done -- it provides convenient wrappers
>>> for a range of optimizers including nlminb.
>>>   I would warn however that you should be very careful interpreting
>>> the meaning of the Hessian matrix if some of your parameters lie
>>> on the boundary of the feasible space ...
>>> set.seed(101)
>>> x <- runif(100)
>>> p <- pnorm(1+3*x)
>>> y <- rbinom(100,p,size=1)
>>> d <- data.frame(x,y)
>>> glmfit <- glm(y~x,family=binomial(link="probit"),data=d)
>>> coef(summary(glmfit))
>>> library(bbmle)
>>> mle2fit <- mle2(y~dbinom(pnorm(pprobit),size=1),
>>>             parameters=list(pprobit~x),
>>>             start=list(pprobit=0),
>>>             data=d)
>>> coef(summary(mle2fit))
>>> ## now add constraints
>>> mle2fit_constr <- mle2(y~dbinom(pnorm(pprobit),size=1),
>>>             parameters=list(pprobit~x),
>>>             start=list(pprobit=0),
>>>             optimizer="nlminb",
>>>             lower=c(2,0.15),
>>>             data=d)
>>> coef(summary(mle2fit_constr))
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html>
>>> and provide commented, minimal, self-contained, reproducible code.

More information about the R-help mailing list