[R-sig-ME] Use of offset variable when analysing rates in lme4

Wed Apr 9 09:20:59 CEST 2014

Dear Caroli and Thierry,

I don't have any objections to what Thierry said regarding the offset in a Poisson model, but I am wondering whether a discrete binomial model might be more appropriate than a Poisson. You said you counted the number of visits per inflorescence. Does that mean you recorded whether or not a pollinator was present in a flower before moving on to the next inflorescence, and that there could be at most one visitor per flower? In other words, if you counted, say, 200 inflorescences, could you record anywhere from 0 to 200 visits? If so, is that not a case of a binomial distribution?

Stroup (2014), doi:10.2134/agronj2013.0342, which was recently discussed on the list, has worked examples of a discrete binomial model if you are interested.

I'd welcome any thoughts on this since I'm about to do a similar analysis. Often in these types of data, the number of visits can be quite low (e.g. 0-20) even when you observe several hundred inflorescences. Does anyone have input on choosing between Poisson vs discrete binomial in similar cases?
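
To make the question concrete, here is a minimal sketch in base R comparing the two distributions when visits are rare relative to the number of inflorescences. The numbers (300 inflorescences, a 3% chance of a visit on each) are purely hypothetical and chosen only for illustration.

```r
# Hypothetical numbers: 300 inflorescences, each visited with probability 0.03
n <- 300
p <- 0.03
k <- 0:25

binom_pmf <- dbinom(k, size = n, prob = p)   # at most one visit per inflorescence
pois_pmf  <- dpois(k, lambda = n * p)        # unbounded count with the same mean

# Largest absolute difference between the two probability mass functions
max_diff <- max(abs(binom_pmf - pois_pmf))
print(max_diff)

# Variances: the binomial n*p*(1-p) is slightly below the Poisson n*p
print(c(binomial = n * p * (1 - p), poisson = n * p))
```

When the per-inflorescence probability is small, the two distributions are nearly indistinguishable, so the choice probably matters most when visits approach the number of inflorescences. If the binomial route is taken in lme4, one possible form (an assumption about the data layout, valid only when visits never exceed infl) would be a response of cbind(visits, infl - visits) with family = binomial in glmer.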

Regards,

Jens Astrom 

-----------------------------------------------------
Message: 1
Date: Thu, 3 Apr 2014 12:22:56 +0000
From: "ONKELINX, Thierry" <Thierry.ONKELINX at inbo.be>
To: "De Waal, C, Mej <caroli at sun.ac.za>" <caroli at sun.ac.za>,
	"r-sig-mixed-models at r-project.org" <r-sig-mixed-models at r-project.org>
Subject: Re: [R-sig-ME] Use of offset variable when analysing rates in
	lme4

Dear Caroli,

Your model is
visits ~ Pois(lambda)
log(lambda) = mu = beta_i * treat + beta_j * timecat + log(infl)

which can be rewritten to

log(lambda) - log(infl) = mu - log(infl) = beta_i * treat + beta_j * timecat
log(lambda) - log(infl) = log(lambda / infl) = log(visitation rate)

Hence beta_i and beta_j are effects on the log(visitation rate) scale. Exponentiating the estimates from glht should give you the relative effects (rate ratios) on the visitation rate.
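
As a rough illustration of the algebra above, the following sketch simulates data with a known visitation rate and checks that exponentiated coefficients recover rates and rate ratios. It deliberately uses a plain Poisson glm() from base R rather than glmer.nb, with no random block effect, so that it runs without extra packages; the data frame d, the two treatment levels and the true rates are all made up for the example.

```r
set.seed(42)

# Simulated stand-in data (hypothetical values, not the poster's data)
d <- data.frame(
  treat = factor(rep(c("A", "B"), each = 200)),
  infl  = sample(50:300, 400, replace = TRUE)
)
true_rate <- ifelse(d$treat == "A", 0.05, 0.10)   # visits per inflorescence
d$visits  <- rpois(nrow(d), lambda = true_rate * d$infl)

# Plain Poisson GLM with the same offset structure (no random effect here)
fit <- glm(visits ~ treat + offset(log(infl)), family = poisson, data = d)

# exp(intercept) estimates the rate in treatment A;
# exp(treatB) estimates the rate ratio B / A (true value 2 in this simulation)
rate_A     <- exp(coef(fit)[["(Intercept)"]])
rate_ratio <- exp(coef(fit)[["treatB"]])
print(c(rate_A = rate_A, rate_ratio = rate_ratio))
```

For the glht object in the original question, the analogous step would presumably be exp(confint(OPexp1)$confint), which back-transforms the pairwise differences and their simultaneous intervals into rate ratios.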

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
+ 32 2 525 02 51
+ 32 54 43 61 85
Thierry.Onkelinx at inbo.be
www.inbo.be


-----Original message-----
From: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed-models-bounces at r-project.org] On behalf of De Waal, C, Mej <caroli at sun.ac.za>
Sent: Wednesday 2 April 2014 11:50
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] Use of offset variable when analysing rates in lme4

Dear all,

This is my first post to the list as I have been unsuccessful in getting an answer elsewhere. I am also quite new to R.

I am investigating variation in pollinator visitation rate (number of visits per inflorescence) with treatment and time category as fixed factors. Block is a random factor. Following Zuur et al. (2009), I used the number of visits as the response variable with log(number of inflorescences) as the offset variable. A Poisson model was overdispersed, and therefore I opted for a negative binomial model in lme4, as follows:

model1 = glmer.nb(visits ~ treat + timecat + offset(log(infl)) + (1|block))

I am specifically interested in differences in visitation rates between treatments. I therefore performed a post hoc test:

OPexp1 =  glht(model1,mcp(treat = "Tukey"))
plot(cld(OPexp1))

When I plot these results, I get number of visits on the y axis. But what I want is visitation rate (visits per inflorescence). How do I specify that the post hoc test should be performed using visitation rate?

I assume what is happening is that fitted values are currently expressed as μ × V, but how do I specify that they should be expressed as μ (visits per inflorescence) only? On p 240, Zuur et al. (2009) mention that this is possible, but I have not been able to find an example.
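
One way this could be approached, sketched here under stated assumptions, is to predict for new data in which the number of inflorescences is set to 1, so that the offset log(1) is zero and the fitted values are per inflorescence. The example uses simulated data and a plain Poisson glm() from base R instead of the glmer.nb model above; the data frame d, the three treatment levels and the rates are hypothetical.

```r
set.seed(7)

# Simulated stand-in data (hypothetical values)
d <- data.frame(
  treat = factor(rep(c("A", "B", "C"), each = 100)),
  infl  = sample(50:300, 300, replace = TRUE)
)
rates <- c(A = 0.04, B = 0.08, C = 0.12)          # true visits per inflorescence
d$visits <- rpois(nrow(d), lambda = rates[as.character(d$treat)] * d$infl)

fit <- glm(visits ~ treat + offset(log(infl)), family = poisson, data = d)

# Setting infl = 1 makes the offset log(1) = 0,
# so the predictions are visits per inflorescence
newd <- data.frame(treat = factor(c("A", "B", "C")), infl = 1)
pred_rate <- predict(fit, newdata = newd, type = "response")
print(setNames(pred_rate, as.character(newd$treat)))
```

With a mixed model fitted in lme4, the same idea should carry over through predict() with re.form = NA for population-level predictions, although that has not been checked against the model in this thread.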

Any advice would be much appreciated.

Kind regards,

Caroli de Waal

PhD Student
University of Stellenbosch, South Africa

