[R] Estimated Standard Error for Theta in zeroinfl()

Wed Feb 24 19:09:37 CET 2010

Dear Dr. Zeileis,

You are right, Baskerville's setup is different from the ZINB model. Baskerville was concerned with transformation and back-transformation of the response variable.

Thank you for taking the time to clarify the issue again, and for the advice on which significance test to use.

Best regards,
Tzeng Yih Lam

------------------------------------------------------------------------------
PhD Candidate
Department of Forest Engineering, Resources and Management
College of Forestry
Oregon State University
321 Richardson Hall
Corvallis OR 97330 USA
Phone: +1.541.713.7504
Fax: +1.541.713.7504
------------------------------------------------------------------------------
________________________________________
From: Achim Zeileis [Achim.Zeileis at uibk.ac.at]
Sent: Tuesday, February 16, 2010 7:45 AM
To: Lam, Tzeng Yih
Cc: Rolf Turner; r-help at r-project.org
Subject: Re: [R] Estimated Standard Error for Theta in zeroinfl()

On Tue, 16 Feb 2010, Lam, Tzeng Yih wrote:

> Dear Dr. Turner,
>
> Thank you very much for taking the time to answer my request. Your suggestion rang a bell, so I did some digging and found the following article, which I read a while ago:
>
> Baskerville, G.L. 1972. Use of logarithmic regression in the estimation of plant biomass. Canadian Journal of Forest Research 2, 49-53.
>
> Baskerville's article (p. 51) gives equations for adjusting the bias to obtain the estimated mean and variance (SE.theta) of theta; they are exactly the same as the ones you provide below.
>
> You are also right in that the assumption is that log(theta) is normally
> distributed.

In Baskerville's setup, that is. For a ZINB model, I am not aware of any
results about exact normality. Depending on whether you define the
log-likelihood for the ZINB model in terms of theta or log(theta), you
get an asymptotically normal distribution for the estimate of theta or
log(theta), respectively, by means of the standard central limit theorem
for maximum likelihood estimation.

(As pointed out in my previous mail, this does not say anything about
bias or unbiasedness. Both estimates are consistent, though.)
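
As a rough sketch of what this asymptotic normality buys you in practice: if the estimate of log(theta) is approximately normal with standard error SE, the first-order delta method suggests SE(theta) is roughly theta * SE. The numbers below are made up for illustration. In a fitted zeroinfl() object I believe the corresponding quantities are stored as fit$theta and fit$SE.logtheta, but please check that against your own fit.

```r
# Hypothetical values standing in for the log(theta) row of summary();
# they are NOT taken from any real fit.
lt_hat <- 0.50   # assumed estimate of log(theta)
se_lt  <- 0.20   # assumed standard error of log(theta)

# First-order delta method: Var(g(x)) ~ g'(x)^2 * Var(x) with g = exp,
# so SE(theta) ~ exp(lt_hat) * SE(log(theta)).
theta_hat <- exp(lt_hat)
se_theta  <- theta_hat * se_lt

# A Wald interval built on the log scale and mapped back through exp()
# always stays positive, unlike theta_hat +/- 1.96 * se_theta.
ci_theta <- exp(lt_hat + c(-1, 1) * qnorm(0.975) * se_lt)

print(c(theta = theta_hat, se = se_theta))
print(ci_theta)
```

This is only a large-sample approximation and says nothing about bias in small samples.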

If you use the likelihood ratio test for inference about hypotheses
with respect to theta or log(theta), both will yield equivalent results.
However, the Wald test (as used in the summary() output) is not invariant
to non-linear transformations of the hypothesis.
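
To make the non-invariance concrete, here is a small sketch with made-up numbers. It tests the same hypothesis on two scales, log(theta) = 0 versus theta = 1, using a delta-method standard error on the theta scale. That standard error is my assumption for the illustration, not something summary() reports.

```r
# Hypothetical estimate and standard error of log(theta).
lt_hat <- 0.50
se_lt  <- 0.20

# Wald statistic for H0: log(theta) = 0.
z_log <- lt_hat / se_lt

# Wald statistic for the equivalent H0: theta = 1, with the
# delta-method standard error exp(lt_hat) * se_lt (an assumption).
theta_hat <- exp(lt_hat)
z_theta   <- (theta_hat - 1) / (theta_hat * se_lt)

# Two-sided p-values from the standard normal reference distribution.
p_log   <- 2 * pnorm(-abs(z_log))
p_theta <- 2 * pnorm(-abs(z_theta))

print(c(z_log = z_log, z_theta = z_theta))
print(c(p_log = p_log, p_theta = p_theta))
```

The two statistics differ although the hypotheses are identical. A likelihood ratio test compares the same two maximized log-likelihoods under either parameterization, so it does not have this problem.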

Best,
Z

> Thank you very much for the advice that you have provided.
>
> Best regards,
> Tzeng Yih Lam
>
> ________________________________________
> From: Rolf Turner [r.turner at auckland.ac.nz]
> Sent: Sunday, February 14, 2010 4:53 PM
> To: Lam, Tzeng Yih
> Cc: r-help at r-project.org
> Subject: Re: [R] Estimated Standard Error for Theta in zeroinfl()
>
> On 15/02/2010, at 12:49 PM, Lam, Tzeng Yih wrote:
>
>> Dear R Users,
>>
>> When the zeroinfl() function is used to fit a Zero-Inflated Negative Binomial (ZINB) model to a dataset, summary() gives an estimate of log(theta) together with its standard error, z-value and Pr(>|z|) for the count component. It also provides an estimate of Theta, which I believe is exp(estimate of log(theta)).
>>
>> However, if I would like a standard error for Theta itself (not SE.logtheta), how would I obtain or calculate it?
>
>
> It's tricky.  The exp and log functions aren't linear!!!
> So nothing really works very well.
>
> To start with, if you have an unbiased estimate of log(theta),
> say lt.hat, then exp(lt.hat) is NOT an unbiased estimate of theta.
>
> If you are willing to assume that lt.hat is normally distributed
> then the expected value of exp(lt.hat) is theta*exp(sigma^2/2)
> where sigma^2 is the variance of lt.hat.
>
> The variance of exp(lt.hat) is
>
>        (*) theta^2 * exp(sigma^2) * (exp(sigma^2) - 1).
>
> You can ``plug in'' the SE of lt.hat for sigma into the foregoing and get
> an ***``approximately''*** unbiased estimate of theta:
>
>        theta.hat = exp(lt.hat - SE^2/2)
>
> and then an approximate estimate of the variance of this ``theta.hat''
> (by plugging in theta.hat for theta and SE for sigma in (*)).  The results
> won't be correct, but they'll probably be in the right ball park.  I think!
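
The plug-in recipe above can be sketched in a few lines. The input values are hypothetical, and the whole calculation rests on the normality assumption for the estimate of log(theta), so treat the output as a ballpark figure only.

```r
# Hypothetical estimate and standard error of log(theta).
lt_hat <- 0.50
se_lt  <- 0.20

# Approximately bias-adjusted estimate: exp(lt.hat - SE^2 / 2).
theta_adj <- exp(lt_hat - se_lt^2 / 2)

# Plug theta_adj for theta and SE for sigma into (*):
# theta^2 * exp(sigma^2) * (exp(sigma^2) - 1).
var_theta <- theta_adj^2 * exp(se_lt^2) * (exp(se_lt^2) - 1)
se_theta  <- sqrt(var_theta)

print(c(theta_adj = theta_adj, se_theta = se_theta))
```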
>
> This is all posited on the distribution of the estimate of log(theta) being
> normal (or ``Gaussian'').  Whether this is a justifiable assumption in your
> setting is questionable.
>
> Some simulation experiments might be illuminating.
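
A minimal simulation along these lines, assuming the estimate of log(theta) really is normal with known sigma. It checks the lognormal moment formulas themselves and does not fit any ZINB model; the values of theta, sigma and the number of draws are arbitrary choices.

```r
set.seed(1)

theta <- 2        # assumed true theta
sigma <- 0.3      # assumed standard deviation of lt.hat
n_sim <- 1e5      # number of simulated estimates

# Draw normally distributed "estimates" of log(theta).
lt_hat <- rnorm(n_sim, mean = log(theta), sd = sigma)

naive    <- exp(lt_hat)                 # plain back-transformation
adjusted <- exp(lt_hat - sigma^2 / 2)   # bias-adjusted version

# Compare simulated moments with the formulas.
mean_naive  <- mean(naive)       # formula: theta * exp(sigma^2 / 2)
mean_adj    <- mean(adjusted)    # formula: theta
var_sim     <- var(naive)
var_formula <- theta^2 * exp(sigma^2) * (exp(sigma^2) - 1)

print(c(mean_naive = mean_naive, expected = theta * exp(sigma^2 / 2)))
print(c(mean_adj = mean_adj, expected = theta))
print(c(var_sim = var_sim, var_formula = var_formula))
```

A more realistic experiment would simulate ZINB data, refit the model each time, and look at the distribution of the fitted log(theta) directly.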
>
>        cheers,
>
>                Rolf Turner
>
> P. S.  The formulae I gave above are the result of quickly scribbled
> calculations, and could be wrong.  They should be checked.
>
>                R. T.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>