[R] Probit regression with limited parameter space
Ben Bolker
bbolker at gmail.com
Thu Feb 2 21:13:41 CET 2012
[cc'ing back to r-help again -- I do this so the answers can be
archived and viewed by others]
On 12-02-02 02:41 PM, Sally Luo wrote:
> Prof. Bolker,
>
> Thanks for your quick reply and detailed explanation.
>
> I also ran the unrestricted model using glmfit <-
> glm(y~x1+x2+x3+x4+x5+x6+x7+x8, family=binomial(link="probit"),data=d).
> However, the results I got from glm and mle2 (both for the unrestricted
> model) are not very similar (please see below). In your earlier example,
> both glm and mle2 produce almost the same estimation results. I just hope
> to figure out what might cause the discrepancy in the estimation results
> I've got.
>
>
> coef(summary(*glmfit*))
> Estimate Std. Error z value Pr(>|z|)
> (Intercept) -0.853900059 0.2464179864 -3.4652505 5.297377e-04
> x1 1.627125691 0.3076174699 5.2894450 1.226881e-07
> x2 -0.092716326 0.5229866504 -0.1772824 8.592866e-01
> x3 -3.301509522 0.9169991843 -3.6003407 3.178004e-04
> x4 7.187483436 2.2135961171 3.2469715 1.166401e-03
> x5 -0.002544181 0.0112740324 -0.2256673 8.214602e-01
> x6 6.978374268 2.2347939216 3.1226030 1.792594e-03
> x7 -0.009832379 0.0113807583 -0.8639476 3.876167e-01
> x8 -0.001252075 0.0002304789 -5.4324941 5.557178e-08
>
>> coef(summary(*mle2fit*))
> Estimate Std. Error z value Pr(z)
> pprobit.(Intercept) -0.603492668 0.230117071 -2.6225463 8.727541e-03
> pprobit.x1 1.645984346 0.288479906 5.7057158 1.158552e-08
> pprobit.x2 -0.157361533 0.523048376 -0.3008546 7.635253e-01
> pprobit.x3 -3.935203692 0.932692587 -4.2191862 2.451857e-05
> pprobit.x4 7.512701611 0.062911076 119.4177885 0.000000e+00
> pprobit.x5 -0.001475556 0.011525137 -0.1280293 8.981258e-01
> pprobit.x6 7.399355063 0.018372749 402.7353318 0.000000e+00
> pprobit.x7 -0.010113008 0.011647725 -0.8682388 3.852636e-01
> pprobit.x8 -0.001650021 0.000244997 -6.7348622 1.640854e-11
My best guess is that you are running into optimization problems.
The big advantage of glm() is that it uses a special-purpose
optimization method (iteratively reweighted least squares) that is
generally much more robust/reliable than general-purpose nonlinear
optimizers such as nlminb. If there is indeed a GLM fitting routine
coded in R, somewhere, that someone has adapted to work with box
constraints, it will probably perform better than mle2.
Some general suggestions for troubleshooting this:
* check the log-likelihoods returned by the two methods. If they are
very close (say within 0.01 likelihood units), then the issue is that
you just have a very flat goodness-of-fit surface, and the two sets of
coefficients are in practice very similar to each other.
* if possible, try starting each approach (glm(), mle2()) from the
solution found by the other (it's a little bit of a pain to get the
syntax right here) and see if they get stuck right where they are or
whether they find that one answer or the other is right.
* if you were using one of the optimizing methods from optim() (rather
than nlminb), e.g. L-BFGS-B, I would suggest you try using parscale to
rescale the parameters to have approximately equal magnitudes near the
solution. This apparently isn't possible with nlminb, but you could try
optimizer="optim" (the default), method="L-BFGS-B" and see how you do
(although L-BFGS-B is often a bit finicky). Alternatively, you can try
optimizer="optimx", in which case you have a larger variety of
unconstrained optimizers to choose from (you have to install the optimx
package and take a look at its documentation). Alternatively, you can
scale your input variables (e.g. use scale() on your input matrix to get
zero-centered, sd 1 variables), although you would then have to adjust
your lower and upper bounds accordingly.
* it's a bit more work, but you may be able to unpack this a bit and
provide analytical derivatives. That would help a lot.
In short: you are entering the quagmire of numerical optimization methods.
I have learned most of this stuff by trial and error -- can anyone on
the list suggest a good/friendly introduction? (Press et al Numerical
Recipes; Givens and Hoeting's Computational Statistics book looks good,
although I haven't read it ...)
Ben Bolker
>
>
>
> On Thu, Feb 2, 2012 at 1:12 PM, Ben Bolker <bbolker at gmail.com> wrote:
>
>>
>> [cc'ing back to r-help]
>>
>> On 12-02-02 01:56 PM, Sally Luo wrote:
>>
>>> I tried to adapt your code to my model and got the results as below. I
>>> don't know how to fix the warning messages. It says "rearrange the lower
>>> (or upper) bounds to match 'start'".
>>
>> The warning is overly conservative in this case. I should work on
>> engineering the package so that it handles this better. You can
>> disregard them.
>>
>> In answer to your previous questions:
>>
>> * "size" refers to the number of trials per observation (1, if you have
>> binary data)
>> * you've got the form of the lower and upper bounds right.
>> * you've got the formula in 'parameters' right -- this builds a linear
>> model (using R's model.matrix) on the probit scale based on the 8
>> parameters
>>>
>>> And two of the estimates for my restricted parameters are on the
>> boundary.
>>> The warning message says the variance-covariance calculations may be
>>> unreliable. Those parameters are the ones of interest to my study. Can
>> I
>>> still make inferences using the p-values reported by mle2 in this case?
>>
>> That's quite tricky unfortunately, and it isn't a problem that's
>> specific to the mle2 package. The basic issue is that the whole
>> derivation of the multivariate normal sampling distribution of the
>> maximum likelihood estimator depends on the maximum likelihood being an
>> interior local maximum (and hence having a negative-definite hessian, or
>> a positive-definite information matrix), which is untrue on the boundary
>> -- the Wikipedia article on maximum likelihood mentions this issue, for
>> example http://en.wikipedia.org/wiki/Maximum_likelihood
>>
>> Perhaps someone here can suggest an approach (although it gets outside
>> the scope of "R help", or you can ask on http://stats.stackexchange.com...
>>
>>
>>>
>>> Thanks for your help. ---- Sally
>>>
>>>
>>> mle2fit <- mle2(y~dbinom(pnorm(pprobit),size=1),
>>> + parameters=list(pprobit~x1+x2+x3+x4+x5+x6+x7+x8),
>>> + start=list(pprobit=0),
>>> + optimizer="nlminb",
>>> + lower=c(-Inf,-1,-1,-1,-Inf,-Inf,-Inf,-Inf,-Inf),
>>> + upper=c(Inf,1,1,1,Inf,Inf,Inf,Inf,Inf),
>>> + data=d)
>>>
>>> Warning messages:
>>> 1: In fix_order(call$lower, "lower bounds", -Inf) :
>>> lower bounds not named: rearranging to match 'start'
>>> 2: In fix_order(call$upper, "upper bounds", Inf) :
>>> upper bounds not named: rearranging to match 'start'
>>> 3: In mle2(y ~ dbinom(pnorm(pprobit), size = 1), parameters =
>> list(pprobit
>>> ~ :
>>> some parameters are on the boundary: variance-covariance calculations
>>> based on Hessian may be unreliable
>>
>>>
>>>
>>> On Wed, Feb 1, 2012 at 11:16 PM, Sally Luo <shali623 at gmail.com> wrote:
>>>
>>>> Prof. Bolker,
>>>>
>>>> Thanks a lot for your reply.
>>>>
>>>> In my model, I have 9 explanatory variables and I need to restrict the
>>>> range of parameters 2-4 to (-1,1). I tried to modify the univariate
>> probit
>>>> example you gave in your reply, however, I could not get through.
>>>>
>>>> Specificially, I am not sure what 'pprobit' represents in your code? How
>>>> should I code this part if I have more than one variable?
>>>>
>>>> Also does "size" refer to the number of parameters?
>>>>
>>>> Since only 3 parameters need to be restricted in my model, should I
>> write
>>>> lower=c(-Inf, -1,-1,-1, -Inf, -Inf, -Inf, -Inf, -Inf) and upper=c(Inf,
>>>> 1,1,1, Inf, Inf, Inf, Inf, Inf)?
>>>>
>>>> Thanks again for your kind help.
>>>>
>>>> Best,
>>>>
>>>> Sally
>>>>
>>>>
>>>>
>>>> On Wed, Feb 1, 2012 at 7:19 AM, Ben Bolker <bbolker at gmail.com> wrote:
>>>>
>>>>> Sally Luo <shali623 <at> gmail.com> writes:
>>>>>
>>>>>>
>>>>>> Dear R helpers,
>>>>>>
>>>>>> I need to estimate a probit model with box constraints placed on
>>>>> several of
>>>>>> the model parameters. I have the following two questions:
>>>>>>
>>>>>> 1) How are the standard errors calclulated in glm
>>>>>> (family=binomial(link="probit")? I ran a typical probit model using
>> the
>>>>>> glm probit link and the nlminb function with my own coding of the
>>>>>> loglikehood, separately. As nlminb does not produce the hessian
>> matrix,
>>>>> I
>>>>>> used hessian (numDeriv) to calculate it. However, the standard errors
>>>>>> calculated using hessian function are quite different from the ones
>>>>>> generated by the glm function, although the parameter estimates are
>> very
>>>>>> close. I was wondering what makes this difference in the estmation of
>>>>>> standard errors and how this computation is carried out in glm.
>>>>>>
>>>>>> 2) Does any one know how to estimate a constrained probit model in R
>>>>> (to be
>>>>>> specific, I need to restrain the range of three parameters to [-1,1])?
>>>>>> Among the optimation functions, so far nlminb and spg work for my
>>>>> problem,
>>>>>> but neither produces a hessian matrix. As I mentioned above, if I use
>>>>>> hessian funciton and calculate standard errors manually, the standard
>>>>>> errors seem not right.
>>>>>>
>>>>>
>>>>> I'm a little biased, but I think the bbmle package is the
>>>>> easiest way to get this done -- it provides convenient wrappers
>>>>> for a range of optimizers including nlminb.
>>>>> I would warn however that you should be very careful interpreting
>>>>> the meaning of the Hessian matrix if some of your parameters lie
>>>>> on the boundary of the feasible space ...
>>>>>
>>>>> set.seed(101)
>>>>> x <- runif(100)
>>>>> p <- pnorm(1+3*x)
>>>>> y <- rbinom(100,p,size=1)
>>>>> d <- data.frame(x,y)
>>>>> glmfit <- glm(y~x,family=binomial(link="probit"),data=d)
>>>>> coef(summary(glmfit))
>>>>>
>>>>> library(bbmle)
>>>>> mle2fit <- mle2(y~dbinom(pnorm(pprobit),size=1),
>>>>> parameters=list(pprobit~x),
>>>>> start=list(pprobit=0),
>>>>> data=d)
>>>>> coef(summary(mle2fit))
>>>>>
>>>>> ## now add constraints
>>>>> mle2fit_constr <- mle2(y~dbinom(pnorm(pprobit),size=1),
>>>>> parameters=list(pprobit~x),
>>>>> start=list(pprobit=0),
>>>>> optimizer="nlminb",
>>>>> lower=c(2,0.15),
>>>>> data=d)
>>>>>
>>>>> coef(summary(mle2fit_constr))
>>>>>
>>>>> ______________________________________________
>>>>> R-help at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html>
>> <http://www.r-project.org/posting-guide.html>
>> >>> and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>>>
>>>>
>>>
>>
>>
>
More information about the R-help
mailing list