[R-sig-ME] Correlated count data technique advice

ONKELINX, Thierry Thierry.ONKELINX at inbo.be
Mon Jan 9 10:07:21 CET 2012


Dear Lee,

A large number of zeros does not imply zero-inflation. E.g.
> mean(rpois(10000, 0.01) == 0)
[1] 0.9902
This simulation has 99% zeros and is not zero-inflated.
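A minimal sketch of the same point: compare the observed share of zeros with the share a Poisson distribution with the same mean would predict. The second sample (y_zi) is a hypothetical zero-inflated series, simulated here only for contrast; the seed, sample size and rates are arbitrary assumptions.

```r
set.seed(1)
n <- 10000

# Plain Poisson with a very small mean: many zeros, but no zero-inflation
lambda <- 0.01
y <- rpois(n, lambda)
observed_zero <- mean(y == 0)
expected_zero <- dpois(0, lambda)        # equals exp(-lambda)

# Hypothetical zero-inflated counts: half the values are structural zeros
y_zi <- rbinom(n, 1, 0.5) * rpois(n, 2)
observed_zero_zi <- mean(y_zi == 0)
expected_zero_zi <- dpois(0, mean(y_zi)) # Poisson with the same mean

round(c(poisson_observed = observed_zero,
        poisson_expected = expected_zero,
        zi_observed      = observed_zero_zi,
        zi_expected      = expected_zero_zi), 3)
```

In the first sample the two proportions should roughly agree; in the second the observed share of zeros should clearly exceed what the Poisson distribution predicts. It is that excess, not the raw number of zeros, that indicates zero-inflation.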

Since you have a time series at only one location, with one measurement per year, there is no point in using a mixed model.
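A rough sketch of what this looks like in practice, using simulated data because the original series is not available here. The names years, year_c and freeze_days and the trend used to simulate them are assumptions for illustration only. With one count per year, a random intercept per year has as many levels as there are observations, so it cannot be separated from the overdispersion. A plain quasi-Poisson glm() followed by a look at the lag-1 autocorrelation of its residuals is one possible starting point.

```r
# Hypothetical data: one count per year at a single location
set.seed(42)
years <- 1950:2011
year_c <- years - mean(years)            # centred year, for numerical stability
freeze_days <- rpois(length(years), lambda = exp(1 - 0.01 * year_c))

# A (1 | years) term would have one level per observation
n_levels <- length(unique(years))
n_obs <- length(freeze_days)
c(levels = n_levels, observations = n_obs)

# Quasi-Poisson GLM without any random effect
fit <- glm(freeze_days ~ year_c, family = quasipoisson)
dispersion <- summary(fit)$dispersion    # estimated overdispersion parameter

# Lag-1 autocorrelation of the Pearson residuals
res <- residuals(fit, type = "pearson")
lag1 <- acf(res, plot = FALSE)$acf[2]    # element 1 is lag 0, element 2 is lag 1
c(dispersion = dispersion, lag1 = lag1)
```

If the residual autocorrelation turns out to be substantial, the model would need a serial correlation term, but that is a different question from adding a random effect.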

Wouldn't it be more relevant to look directly at the temperature than using a derived variable?

Best regards,

Thierry

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
+ 32 2 525 02 51
+ 32 54 43 61 85
Thierry.Onkelinx at inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

-----Original message-----
From: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed-models-bounces at r-project.org] On behalf of Lee Davis
Sent: Friday 6 January 2012 19:54
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] Correlated count data technique advice

Please excuse me for having posted a similar question on ecolog, but thus far I have received few useful answers there.

I am looking for some advice concerning techniques in R that are appropriate for correlated count data.

Specifically, I have some "freezing days" data, which is a count of the number of days each spring that were below freezing. The counts were taken at the same location over a period of years. The data set is highly zero inflated and over-dispersed; glm with a quasipoisson error structure would seem to be appropriate, except that there is a high degree of correlation at lags of 1 making something like a corAR1 structure appropriate. My difficulty is that glm() does not take an argument for correlation.

I could use  lmer() to fit a model like:

freezing days~years+(1|years), family=quasipoisson, correlation=corAR1

but lmer (and glmer) don't seem to be operating on quasi families anymore; I've found plenty of old posts here where lmer seems to have accepted quasi families in the past, but I get an error message that indicates lmer does not in fact accept quasi families.

I should note that I have run the following model:

 freeze.glmmPQL3 <- glmmPQL(num.freeze.days ~ years, random = ~1 | years,
        family = quasipoisson, correlation = corAR1())

My gut says this is not the correct approach, and I am unconvinced by the tiny p-values that have been returned. Specifying poisson vs quasipoisson, or specifying corAR1(), seems to make no difference to the parameter estimates or their p-values; it would seem that the random term for a varying intercept by year is dominant. Maybe this is OK, but my glm models above return non-significant results, and I expected that handling the correlation would increase my p-values rather than decrease them. Perhaps that is an incorrect assumption.

Therefore I need some alternative to look at trends in this data over time that allows for quasipoisson error and something along the lines of a corAR1() structure (or a mixed model that handles temporal pseudo-replication, but I am hesitant here).

Thank you in advance,
Lee


_______________________________________________
R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models



