[R] Time series count model?

Thomas Lumley tlumley at u.washington.edu
Tue Nov 20 17:35:58 CET 2001


On Tue, 20 Nov 2001 pauljohn at ukans.edu wrote:

> I am trying to counsel a student who has count data (with many
> 0's and small nuumbers) that is a time series.  He was fitting
> linear models to this data, but the count nature of the data
> causes me a lot of concern, and I am looking around for a time
> series approach to a model which allows all the bells and
> whistles of count models. By chance, I notice that Congdon's
> WinBUGS examples have an example that has a count model with
> time series issues, but I fear it might be too great of a leap
> for us to justify a paradigm switch to Bayesian statistics in
> order to fit this one model.
>
> If you were using R, how would you get a foot hold on this
> problem?
>

<snip details of model>
>
> We have a copy of Stata sitting around here and the students
> have found in there a canned procedure to estimate that model,
> it does various specificiation tests, such as a test for whether
> the zero inflation part is necessary.  But that does not attend
> to the time-series part. I don't do Stata myself, I am learning
> R, and would like to see if I can do this in R.
>
> We need to incorporate the possibility of an "error term" that
> is influenced by its own lagged values in the usual ARIMA
> sense.  Looking back to the justification for the Negative
> Binomial in the first place, I remember one justification for
> the NB model was a Poisson with heterogeneity.
>
> p(y | X,b) = Pois (exp(Xb + e))
>
> I don't think I ever understood very well why this leads to a NB
> model. Maybe that's where I need to study.

It only does if exp(e) is Gamma distributed.  If e is Normal it leads to a
Poisson-Normal mixture. They're going to be almost indistinguishable in
practice.


> Nevertheless, where can I go if I start with that theory, but
> the e are not independent, say they are MA(1)
>
> e_t = g*e_{t-1} + u_t
>
> and u_t is Normal(0,sigma^2).
>
> Should I just write out a big log likelihood function and use
> R's optim to fit it?
>
> It seems like I'm missing out on something by going that route,
> though.

Some other things to consider, in addition to maximum likelihood

- Zeger (Biometrika 1988, pp621-629) gives an estimation procedure for a
time series count model that is based on a Poisson-Normal mixture.

- Something more or less similar is discussed by Davis et al (Biometrika
2000, p491-505)

- There are standard methods for standard error estimation in time series
regression problems, based on the same principles as the sandwich
estimators used for longitudinal data.  Some of these (eg the Newey-West
estimator) have been used in econometrics for a long time. Although they
are most often used for continuous response variables they work perfectly
well for counts.  Stata 7.0 does the Newey-West estimator for generalised
linear models, which may be what you had heard about.  These and related
sandwich-type estimators are reviewed by Lumley & Heagerty (JRSSB
1999,pp459-477).  I have R code, but again only for generalised linear
models, not for zero-inflation models.

- The Poisson-Normalmodel leads to a loglinear marginal model that can be
fitted by glm(), and I would expect something similar  to be true of
the zero-inflation model. This means that you may be able to just estimate
a marginal model (unless you are actually interested in inference about
the correlation structure).  In principle this could be inefficient, but
for very discrete data there isn't much information in the
autocorrelation.

- The full likelihood is intractable anyway -- it doesn't factorise the
way a Gaussian AR-1 does.  That's one reason Bayesians like these models:
MCMC is the easy way out computationally (though still not trivial).
There's a fairly popular approximate maximum likelihood method called PQL
that works reasonably well except in binary and small count data.


	-thomas

Thomas Lumley			Asst. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._



More information about the R-help mailing list