[R-sig-eco] Logistic regression with repeated measures ?

Wed Nov 27 23:29:07 CET 2013

Hi Marieline,

I would consider using the raw data so it is binary, and use a random
intercept to account for the different sampling intensities for each bird.
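
As a rough sketch of what that could look like, assuming the lme4 package is installed and the raw fixes are stored one row per GPS fix: the data below are simulated, and every name (fixes, bird, in_agri, rain, sex) is hypothetical rather than taken from the real data set.

```r
# Sketch only: simulated data standing in for the raw GPS fixes.
# All object and column names here are hypothetical.
library(lme4)

set.seed(1)
n_birds <- 20
n_fixes <- 100                      # fixes per bird (assumed equal here)

# One row per bird: bird-level covariates plus a true bird effect
birds <- data.frame(
  bird = factor(seq_len(n_birds)),
  sex  = factor(sample(c("F", "M"), n_birds, replace = TRUE)),
  rain = runif(n_birds, 1, 5),      # rainfall in the previous week
  u    = rnorm(n_birds, sd = 1)     # bird-level deviation on the logit scale
)

# Expand to one row per GPS fix, with a 0/1 response:
# was this fix in agricultural habitat?
fixes <- birds[rep(seq_len(n_birds), each = n_fixes), ]
eta <- -0.5 + 0.3 * fixes$rain + fixes$u
fixes$in_agri <- rbinom(nrow(fixes), size = 1, prob = plogis(eta))

# Random intercept per bird: fixes from the same bird share a baseline
fit <- glmer(in_agri ~ rain + sex + (1 | bird),
             data = fixes, family = binomial)
summary(fit)
```

Because the covariates here are constant within a bird, the random intercept mainly soaks up between-bird variation that the fixed effects do not explain. It does not by itself model serial correlation between consecutive fixes.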

If you have summarised the data so that there is one score per bird,
expressed as a percentage, I can't see how you can account for repeated
measures, since the correlation structure is no longer in your data.

It's also hard to tell whether you will need to account for
pseudoreplication caused by serial correlation (i.e. repeated measures),
since I can't see any indication of the fix rate. Plotting the residuals
may help. That said, residuals from logistic regressions can be hard to
interpret, so some type of lagged correlation plot would likely be better.
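
One way to draw such a lagged correlation plot, as a minimal sketch using base R only: the series below is simulated for a single bird, and the name in_agri is a hypothetical 0/1 indicator ordered by fix time.

```r
# Sketch only: simulated fixes for one bird, in time order.
# 'in_agri' is a hypothetical 0/1 indicator of habitat use at each fix.
set.seed(2)
n <- 200
p_stay <- 0.9                       # assumed chance of staying in the same state
in_agri <- integer(n)
in_agri[1] <- 1L
for (t in 2:n) {
  # Sticky two-state chain: the bird tends to be where it was at the last fix
  in_agri[t] <- if (runif(1) < p_stay) in_agri[t - 1] else 1L - in_agri[t - 1]
}

# Pearson residuals from an intercept-only logistic regression
fit <- glm(in_agri ~ 1, family = binomial)
res <- residuals(fit, type = "pearson")

# Autocorrelation of the residuals at lags 0 to 20
ac <- acf(res, lag.max = 20, plot = FALSE)
plot(ac, main = "Residual autocorrelation, one bird")
```

With real data this would be done separately for each bird, with the rows sorted by timestamp. Bars that stay well outside the dashed bounds at short lags suggest that consecutive fixes are not independent.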

Chris Howden B.Sc. (Hons) GStat.
Founding Partner
Evidence Based Strategic Development, IP Commercialisation and Innovation,
Data Analysis, Modelling and Training
(mobile) 0410 689 945
(skype) chris.howden
chris at trickysolutions.com.au

-----Original Message-----
From: r-sig-ecology-bounces at r-project.org
[mailto:r-sig-ecology-bounces at r-project.org] On Behalf Of marieline gentes
Sent: Thursday, 28 November 2013 8:30 AM
To: r-sig-ecology at r-project.org; moi
Subject: [R-sig-eco] Logistic regression with repeated measures ?

Dear list,

I am a bit new to logistic regressions. I am working with GPS data from
GPS-tracked birds. My objective is to investigate whether various
covariates influence the probability of visiting specific habitats. Each
bird has visited many habitats during the course of its GPS tracking.

Here is a small sample of the data:

Bird.ID  Year  Sex  body.index  Recapt     PrevWeek.Rain  AgriYes  AgriNo  UrbanYes  UrbanNo
CAL      2010  M      21.99155  13-May-10           1.43        0     100         0      100
CAO      2011  F     -19.91797  27-Apr-11           4.23       54      46         9       91
CFL      2010  F      25.61063  12-May-10           2.16       31      69         2       98
CFP      2010  M     -30.65814  13-May-10           1.43       60      40         0      100

I understand that I have to use logistic regression, with a cbind code,
because my response variable is not binary anymore (the response is a
summary of the success vs failures).

Based on R tutorials, I am thinking about codes that would look like this:

  Agri.RainSex <- glm(cbind(AgriYes, AgriNo) ~ PrevWeek.Rain + Sex + Year +
                      Year*Sex, family = binomial(link = "logit"),
                      data = mydata)

However, contrary to the examples I see online, my data are from individual
birds, not from groups of birds. If I had been using the raw binary data,
each bird would have 100 lines (I converted the percentages into
successes/failures; all my percentages are weighted the same, so that is
not a problem here). Am I supposed to take into account some kind of
repeated measures in my model?

Notes:
For people who are thinking about overdispersed data: my data does not
seem to be overdispersed. But I will inspect that after I am confident
that my basic model is ok. So this question is about dealing with repeated
measures, not about adding a random intercept for overdispersion.

For people who are working with habitat selection models: this is not the
case here. We are not working on resource selection. We want to fit a
simple logistic regression on this data as a part of data exploration.
The ultimate goal is to link contaminant burden with the proportion of
time spent in a given habitat.

Thank you for your advice,

Marie