[R-sig-eco] First mixed effects model - any comments are highly appreciated

Tue Mar 30 22:33:43 CEST 2010

Hi all.

After plowing through quite a few chapters and several web pages on the
subject, I have just made my first attempt at mixed-effects modelling on
one of my own datasets (using lme from the nlme package). The primary
reason for choosing a mixed-effects model is that the data are
longitudinal, with repeated measures on the same individuals. As a
PhD student with a limited statistical background (I am a biologist), I
would really appreciate any comments on this.

The dataset arises from automated telemetric monitoring of a number of
fish tagged with acoustic transmitters and basically consists of a
measure of activity (distance moved per day) for each fish. During the study
period we have 'recorded' four cannibalistic events where one of the
tagged fish eats another tagged fish. The change in activity of these
cannibals (activity before vs. activity after the cannibalistic event)
compared to the remaining fish is the primary interest in this analysis.
The hypothesis is that the cannibal in each event expressed reduced
activity in the days following the cannibalistic event, while the
remaining fish showed no change. For each fish, the activity on the five
days before and the five days after each event is included. Note that the
number of fish in each event differs (range 12-20).

The data structure looks like this:

Event - Before_After - Cannibal - Dist/Day - Fish

where
  Event (1, 2, 3, 4)     = the cannibalistic event
  Before_After (-1, 1)   = whether the row holds a value from before or
                           after the cannibalistic event
  Cannibal (0, 1)        = whether or not the fish is the cannibal in
                           that event
  Dist/Day (numerical)   = the measured activity
  Fish (factor)          = identifier for each fish
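
To make the layout concrete, here is a minimal sketch in R that builds a
small made-up data frame with the same columns. The values are random,
and the names toy, Day and Dist_Day are placeholders of mine (Dist_Day
stands in for Dist/Day, which is not a syntactic R name). I also assume
here that -1 means "before" and 1 means "after":

```r
# Toy data in the layout described above. All values are invented.
set.seed(1)

toy <- expand.grid(
  Day   = 1:10,                             # placeholder: days 1-5 before, 6-10 after
  Fish  = factor(sprintf("F%02d", 1:12)),   # 12 tagged fish
  Event = factor(1:2)                       # only two events, for brevity
)

# Assumption: -1 = before the event, 1 = after the event
toy$Before_After <- ifelse(toy$Day <= 5, -1, 1)

# Assumption: fish F01 is the cannibal in event 1 and F02 in event 2
toy$Cannibal <- as.integer(
  (toy$Event == "1" & toy$Fish == "F01") |
  (toy$Event == "2" & toy$Fish == "F02")
)

# Invented activity values (distance moved per day)
toy$Dist_Day <- round(rlnorm(nrow(toy), meanlog = 5, sdlog = 0.5), 1)

# Reorder the columns to match the layout given above
toy <- toy[, c("Event", "Before_After", "Cannibal", "Dist_Day", "Fish")]
head(toy)
```

Each fish contributes ten rows per event, five before and five after,
which is the repeated-measures structure that motivates the random term.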

To fit a model to this dataset I followed the ten-step protocol given in
Alain F. Zuur et al (2009): "Mixed Effects Models and Extensions in
Ecology with R". I started out with the three-way interaction
Before_After*Cannibal*Event and after model-reduction based on step-wise
term deletion ended up with the following model:

M.Final <- lme(`Dist/Day` ~ Before_After * Cannibal,
               random = ~ 1 | Fish/Event,
               correlation = corAR1(form = ~ 1 | Fish/Event))
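
For reference, one term-deletion step of the kind I used looks roughly
like the sketch below. It runs on simulated data and, to keep it short,
uses only a random intercept per fish and no correlation structure, so
it is not the exact model above. The names sim, Day, Dist_Day, full and
reduced are placeholders, and -1 = before is an assumption:

```r
library(nlme)

# Simulated stand-in for the real data (all values invented).
set.seed(42)
sim <- expand.grid(Day   = 1:10,
                   Fish  = factor(sprintf("F%02d", 1:12)),
                   Event = factor(1:4))
sim$Before_After <- ifelse(sim$Day <= 5, -1, 1)   # assuming -1 = before
# One (hypothetical) cannibal per event: fish k is the cannibal in event k
sim$Cannibal <- as.integer(as.integer(sim$Fish) == as.integer(sim$Event))
fish_effect <- rnorm(12, sd = 20)                 # random intercept per fish
sim$Dist_Day <- 200 + fish_effect[as.integer(sim$Fish)] -
  30 * sim$Before_After * sim$Cannibal + rnorm(nrow(sim), sd = 15)

# Models with different fixed effects are compared using ML, not REML.
full <- lme(Dist_Day ~ Before_After * Cannibal * Event,
            random = ~ 1 | Fish, data = sim, method = "ML")

# Same model without the three-way interaction
reduced <- lme(Dist_Day ~ (Before_After + Cannibal + Event)^2,
               random = ~ 1 | Fish, data = sim, method = "ML")

# Likelihood ratio test: does dropping the term worsen the fit?
lrt <- anova(full, reduced)
print(lrt)
```

If the test is not significant, the term is dropped and the procedure is
repeated on the reduced model; the final model is then refitted with
REML.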

The summary(M.Final) shows that the interaction term is highly
significant (p = 0.0001). The various model validation plots look quite
okay.

To the best of my knowledge, M.Final models the activity (Distance per
day) explained by the Before_After*Cannibal interaction as well as the
main effects. The random term (1 | Fish/Event) allows a different
intercept for each individual fish in each event, while the correlation
structure accounts for the fact that activity on consecutive days
is highly correlated. The fact that the interaction term is
significant shows that the cannibals and the remaining fish indeed
responded differently to the cannibalistic events. A graphical
representation of this clearly shows that the cannibals indeed had lower
activity after the events compared to before while the remaining fish
did not show any changes in activity.
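
As a check on my reading of the interaction: if Before_After (-1/1) and
Cannibal (0/1) both enter the model as numeric variables, the fitted
mean is b0 + b1*Before_After + b2*Cannibal + b3*Before_After*Cannibal.
The after-minus-before change is then 2*b1 for the remaining fish and
2*(b1 + b3) for the cannibals. A small sketch of that arithmetic, using
made-up coefficients rather than my actual estimates (the function name
is mine, and I assume 1 = after and -1 = before):

```r
# After-minus-before change in expected activity for each group,
# assuming numeric coding Before_After = -1/1 and Cannibal = 0/1.
change_after_vs_before <- function(beta) {
  b1 <- beta[["Before_After"]]
  b3 <- beta[["Before_After:Cannibal"]]
  c(non_cannibal = 2 * b1,
    cannibal     = 2 * (b1 + b3))
}

# Invented coefficients, for illustration only
beta_example <- c("(Intercept)"           = 200,
                  "Before_After"          = 1,
                  "Cannibal"              = 5,
                  "Before_After:Cannibal" = -30)

change_after_vs_before(beta_example)
# non_cannibal = 2, cannibal = -58
```

With the real fit, the input would be fixef(M.Final). If the predictors
are stored as factors instead, the coefficient names and the arithmetic
will differ.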

The above paragraph is what I believe the model is doing, so please do
correct me if I am wrong!

As a biologist, I am left with a few questions, e.g.:
Is it (at least somewhat) correct to do it as described above? 
Could the final model be improved to better fit the biological reality? 
Is my interpretation of the output correct? 
Can I somehow get more numerical information about the interaction term?
At present, I rely on the graphs to see the 'direction' of the
interaction.
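
One possibility I have come across is to pull the coefficient table and
the confidence intervals for the fixed effects out of the fitted object.
A minimal sketch on simulated data follows. It uses a simpler random
structure than M.Final (random intercept per fish only, no corAR1), and
sim, Day, Dist_Day and fit are placeholder names. I would assume the
same two calls work on M.Final, but please correct me if not:

```r
library(nlme)

# Simulated stand-in for the real data (all values invented).
set.seed(42)
sim <- expand.grid(Day   = 1:10,
                   Fish  = factor(sprintf("F%02d", 1:12)),
                   Event = factor(1:4))
sim$Before_After <- ifelse(sim$Day <= 5, -1, 1)   # assuming -1 = before
sim$Cannibal <- as.integer(as.integer(sim$Fish) == as.integer(sim$Event))
fish_effect <- rnorm(12, sd = 20)
sim$Dist_Day <- 200 + fish_effect[as.integer(sim$Fish)] -
  30 * sim$Before_After * sim$Cannibal + rnorm(nrow(sim), sd = 15)

fit <- lme(Dist_Day ~ Before_After * Cannibal,
           random = ~ 1 | Fish, data = sim)

# Estimate, standard error, DF, t-value and p-value for each fixed effect
tt <- summary(fit)$tTable
print(tt["Before_After:Cannibal", ])

# Approximate 95% confidence intervals for the fixed effects
ci <- intervals(fit, which = "fixed")$fixed
print(ci["Before_After:Cannibal", ])
```

The sign of the interaction estimate gives the direction, and the
interval gives an idea of its size, without having to read it off a
graph.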

So if any of you could spare a few minutes to share your thoughts on the
above, I would be VERY happy!

All the best, 
Henrik B