[R] Estimated Standard Error for Theta in zeroinfl()

Tue Feb 16 15:13:07 CET 2010

Dear Dr. Turner,

Thank you very much for taking the time to answer my request. Your suggestion rang a bell, so I went digging and found the following article, which I read a while ago:

Baskerville, G.L. 1972. Use of logarithmic regression in the estimation of plant biomass. Canadian Journal of Forest Research 2, 49-53.

In Baskerville's article (p. 51), there are equations for correcting the bias to obtain the estimated mean and variance (SE.theta) of Theta; they are exactly the same as the ones you provided below.

You are also right that this rests on the assumption that the estimate of log(theta) is normally distributed.

Thank you very much for the advice that you have provided.

Best regards,
Tzeng Yih Lam

------------------------------------------------------------------------------
PhD Candidate
Department of Forest Engineering, Resources and Management
College of Forestry
Oregon State University
321 Richardson Hall
Corvallis OR 97330 USA
Phone: +1.541.713.7504
Fax: +1.541.713.7504
------------------------------------------------------------------------------
________________________________________
From: Rolf Turner [r.turner at auckland.ac.nz]
Sent: Sunday, February 14, 2010 4:53 PM
To: Lam, Tzeng Yih
Cc: r-help at r-project.org
Subject: Re: [R] Estimated Standard Error for Theta in zeroinfl()

On 15/02/2010, at 12:49 PM, Lam, Tzeng Yih wrote:

> Dear R Users,
>
> When using the zeroinfl() function to fit a Zero-Inflated Negative Binomial (ZINB) model to a dataset, summary() gives an estimate of log(theta) along with its standard error, z-value and Pr(>|z|) for the count component. It also provides an estimate of Theta, which I believe is exp(estimate of log(theta)).
>
> However, if I would like to have a standard error of Theta itself (not SE.logtheta), how would I obtain or calculate that standard error?

It's tricky.  The exp and log functions aren't linear!!!
So nothing really works very well.

To start with, if you have an unbiased estimate of log(theta),
say lt.hat, then exp(lt.hat) is NOT an unbiased estimate of theta.

If you are willing to assume that lt.hat is normally distributed
then the expected value of exp(lt.hat) is theta*exp(sigma^2/2)
where sigma^2 is the variance of lt.hat.

The variance of exp(lt.hat) is

        (*) theta^2 * exp(sigma^2) * (exp(sigma^2) - 1).

You can ``plug in'' the SE of lt.hat for sigma into the foregoing and get
an ***``approximately''*** unbiased estimate of theta:

        theta.hat = exp(lt.hat - SE^2/2)

and then an approximate estimate of the variance of this ``theta.hat''
(by plugging in theta.hat for theta and SE for sigma in (*)).  The results
won't be correct, but they'll probably be in the right ball park.  I think!
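
The plug-in recipe above can be sketched in a few lines. This is only an illustration under the same normality assumption, written in Python for convenience (the arithmetic carries over directly to R); the function name plug_in_theta and the input values 0.5 and 0.2 are hypothetical stand-ins for the log(theta) estimate and its standard error reported by summary().

```python
import math

def plug_in_theta(lt_hat, se):
    """Plug-in estimates for theta given an estimate of log(theta) and its SE.

    Assumes lt_hat is normally distributed with standard deviation se.
    """
    s2 = se ** 2
    # Approximately unbiased estimate: theta.hat = exp(lt.hat - SE^2/2)
    theta_hat = math.exp(lt_hat - s2 / 2.0)
    # Formula (*) with theta.hat plugged in for theta and SE for sigma;
    # expm1(s2) is exp(s2) - 1, computed accurately for small s2.
    var_star = theta_hat ** 2 * math.exp(s2) * math.expm1(s2)
    return theta_hat, math.sqrt(var_star)

# Hypothetical inputs, not taken from any real fit
theta_hat, se_theta = plug_in_theta(lt_hat=0.5, se=0.2)
print(theta_hat, se_theta)
```

Note that (*) is the variance of exp(lt.hat). Since theta.hat is that quantity multiplied by the constant exp(-SE^2/2), its own variance under the same assumption (treating SE as known) would be smaller by a factor of exp(-SE^2); for a small SE the difference is minor.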

This is all posited on the distribution of the estimate of log(theta) being
normal (or ``Gaussian'').  Whether this is a justifiable assumption in your
setting is questionable.

Some simulation experiments might be illuminating.
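
As a starting point, here is a minimal simulation sketch in Python (standard library only). It checks just the two lognormal formulas above, assuming lt.hat really is normal; it does not test whether that assumption holds for a fitted ZINB model, which would require simulating data and refitting the model. The function name simulate and the values theta = 2, sigma = 0.3 are arbitrary choices for illustration.

```python
import math
import random

def simulate(theta=2.0, sigma=0.3, n=200_000, seed=1):
    """Compare the simulated mean/variance of exp(lt.hat) with the formulas."""
    rng = random.Random(seed)
    mu = math.log(theta)
    # Draw lt.hat ~ Normal(log(theta), sigma) and exponentiate each draw
    draws = [math.exp(rng.gauss(mu, sigma)) for _ in range(n)]
    sim_mean = sum(draws) / n
    sim_var = sum((x - sim_mean) ** 2 for x in draws) / (n - 1)
    # Theoretical values under normality of lt.hat
    theory_mean = theta * math.exp(sigma ** 2 / 2.0)
    theory_var = theta ** 2 * math.exp(sigma ** 2) * math.expm1(sigma ** 2)
    return sim_mean, theory_mean, sim_var, theory_var

sim_mean, theory_mean, sim_var, theory_var = simulate()
print("mean:    ", sim_mean, "vs", theory_mean)
print("variance:", sim_var, "vs", theory_var)
```

If the formulas are right, the simulated and theoretical values should agree closely for a large number of draws.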

        cheers,

                Rolf Turner

P. S.  The formulae I gave above are the result of quickly scribbled
calculations, and could be wrong.  They should be checked.

                R. T.