[R] function censReg in panel data setting

Arne Henningsen arne.henningsen at googlemail.com
Wed Sep 14 09:40:17 CEST 2011


On 14 September 2011 00:36, Arne Henningsen
<arne.henningsen at googlemail.com> wrote:
> Hi Igors
>
> On 13 September 2011 13:27, Igors <igors.lahanciks at gmail.com> wrote:
>> Any success in finding possible solutions for my problem?
>
> Somewhat. The calculation of the log-likelihood values is numerically
> much more robust/stable now. The log-likelihood contributions of some
> individuals became minus infinity in your model. This was caused by
> rounding errors as illustrated in the following simplified example:
>
> log( exp( a ) + exp( b ) )
>
> If a and b become smaller than approximately -800, exp( a ) and exp( b
> ) are rounded to zero and the log of their sum (zero) is minus
> infinity.
> I have solved this problem by replacing the above calculation by
>
> log( exp( a - c ) + exp( b - c ) ) + c
> with c = max( a, b )
>
> The source code of the improved censReg package is available on
> R-Forge [1]; R packages will be available on R-Forge [2] probably
> within one day.
>
> [1] https://r-forge.r-project.org/scm/?group_id=256
> [2] https://r-forge.r-project.org/R/?group_id=256
>
> Unfortunately, the calculation of the gradients is still not robust,
> but I expect that I can solve this problem in much the same way as I
> solved the problem with the likelihood function itself. I will
> continue working on this.
>
>> I have experimented with the sample size and the picture is really bad.
>> I can't get it to work even with a sample of ~ 1000 obs., which is far
>> fewer than I would like to see working, given that my full sample has
>> ~ 540 000 obs.
>
> I hope that you have a very fast computer -- or a lot of time for
> waiting many days or even a few weeks.
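
To illustrate the rescaling of the log-likelihood quoted above, here is
a minimal sketch in R. It is not the actual censReg code; the function
names logSumExpNaive and logSumExpStable and the variable "shift" (the
"c" of the formula) are made up for this example.

```r
# Naive version: underflows when a and b are very negative.
logSumExpNaive <- function(a, b) {
  log(exp(a) + exp(b))
}

# Rescaled version: subtract shift = max(a, b) before exponentiating
# and add it back after taking the log ("c" in the formula above).
logSumExpStable <- function(a, b) {
  shift <- pmax(a, b)
  log(exp(a - shift) + exp(b - shift)) + shift
}

a <- -1000
b <- -1001
logSumExpNaive(a, b)   # -Inf, because exp(a) and exp(b) are both rounded to 0
logSumExpStable(a, b)  # finite, equal to -1000 + log(1 + exp(-1))
```

After the shift, the larger of the two exponents is exactly zero, so at
least one term of the sum equals 1 and the log cannot become minus
infinity as long as a and b themselves are finite.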

Now I have also improved the numerical stability of the calculation of
the *gradients* of the log-likelihood function. Basically, the problem
was that in an equation similar to

( exp(a1) * b1 + exp(a2) * b2 ) / exp(d)

a1, a2, and d could take values below about -800, so that exp(a1),
exp(a2), and exp(d) were all rounded to zero and hence the entire
expression became zero divided by zero ("NaN") in specific cases. As
a1, a2, and d are usually of similar size, the differences a1-d and
a2-d stay moderate, so I could solve this problem by rearranging the
above expression to

exp(a1-d) * b1 + exp(a2-d) * b2
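
As a minimal sketch in R of the two forms of this expression (again not
the actual censReg code; the function names gradTermNaive and
gradTermStable and the numbers are made up for this example):

```r
# Original form: numerator and denominator both underflow to zero.
gradTermNaive <- function(a1, a2, b1, b2, d) {
  (exp(a1) * b1 + exp(a2) * b2) / exp(d)
}

# Rearranged form: only the differences a1 - d and a2 - d are
# exponentiated, and these stay moderate if a1, a2, and d are similar.
gradTermStable <- function(a1, a2, b1, b2, d) {
  exp(a1 - d) * b1 + exp(a2 - d) * b2
}

gradTermNaive(-1000, -1001, 2, 3, -1000.5)   # NaN (zero divided by zero)
gradTermStable(-1000, -1001, 2, 3, -1000.5)  # finite: 2*exp(0.5) + 3*exp(-0.5)
```

Both forms are algebraically identical, so they agree (up to rounding)
whenever the naive form does not underflow.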

I hope that the calculations are now numerically stable enough for you
to estimate your large model. Please let me know whether it works now.

Again, the source code of the improved censReg package is available on
R-Forge [1]; R packages will be available on R-Forge [2] probably
within one day. Please note that you need at least revision 1207.

[1] https://r-forge.r-project.org/scm/?group_id=256
[2] https://r-forge.r-project.org/R/?group_id=256

Best regards,
Arne

-- 
Arne Henningsen
http://www.arne-henningsen.name


