[R-sig-ME] A modeling question for the ecologists

Jeremy Koster helixed2 at yahoo.com
Fri Sep 21 04:51:20 CEST 2012


Thanks, Jim.  There are definitely some limitations to the dataset (which was collected for other purposes).  For example, we have data only from successful hunts, so we can't really estimate how much time dogs spend on nocturnal versus diurnal hunts (speaking to your first concern).  A correlation between the use of dogs and the activity patterns of prey species (e.g., nocturnal) wouldn't necessarily mean a whole lot if dogs were used almost exclusively at night -- hence our efforts to include the time of the harvest as a crude indicator along these lines.  And yes, a simple binary variable to denote the presence of dogs overlooks variation in their abilities (a favorite topic of mine).

To rephrase, we're basically interested in testing for a correlation between the use of dogs and the relative preponderance of nocturnal species in the harvest (as in your first point).  Because we have contextual data for each harvest -- specifically the time at which the animal was harvested -- I was reluctant to aggregate the data, which steered me toward a logistic regression and the need to account for the repeated harvests of the same species via a mixed model.  But since these are all categorical predictors, it occurs to me that there might be better alternatives . . .
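
Since all the variables are categorical, one rough first look before fitting any model is a plain cross-tabulation of the harvests. The sketch below uses simulated data only: the data frame `d`, the species labels, and the level names ("day"/"night", "yes"/"no") are assumptions made for illustration, not the real dataset. It ignores the repeated harvests of the same species, so it is a descriptive check rather than a substitute for the mixed model.

```r
# Simulated stand-in for the real data; all values and level names are hypothetical.
set.seed(1)
n <- 200
species <- sample(paste0("sp", 1:20), n, replace = TRUE)

# Activity is fixed per species: here the first 8 species are labelled nocturnal.
activity_by_species <- setNames(
  rep(c("nocturnal", "diurnal"), c(8, 12)),
  paste0("sp", 1:20)
)

d <- data.frame(
  Species  = species,
  Activity = unname(activity_by_species[species]),
  Time     = sample(c("day", "night"), n, replace = TRUE),
  Dogs     = sample(c("no", "yes"), n, replace = TRUE)
)

# Counts of harvests by dog presence and species activity, split by time of harvest.
tab <- xtabs(~ Dogs + Activity + Time, data = d)
print(ftable(tab, row.vars = c("Time", "Dogs")))

# Within each Dogs-by-Time cell, the share of harvested animals from nocturnal species.
prop_nocturnal <- prop.table(tab, margin = c(1, 3))[, "nocturnal", ]
print(round(prop_nocturnal, 2))
```

Comparing the "yes" and "no" rows of `prop_nocturnal` within each time column gives the raw version of the contrast we care about, conditional on the time of harvest.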

In any case, a colleague just suggested this model:

library(lme4)  # provides glmer()
model <- glmer(Activity ~ Dogs * Time + (1 | Species), family = binomial, data = d)


But that one just seems peculiar . . . albeit for reasons that I can't fully articulate.



----- Original Message -----
From: "Baldwin, Jim -FS" <jbaldwin at fs.fed.us>
To: Jeremy Koster <helixed2 at yahoo.com>; "r-sig-mixed-models at r-project.org" <r-sig-mixed-models at r-project.org>
Cc: 
Sent: Thursday, September 20, 2012 10:23 PM
Subject: RE: [R-sig-ME] A modeling question for the ecologists

I'm not in Ben's league (either as an ecologist or as a statistician), and proof of that may lie in the following questions/comments:

1.  By having presence of dogs as the dependent variable, it seems that the question (potentially) addressed by the model is "Given that we have a capture from a nocturnal species, what proportion involved dogs?"  But that would be influenced by how often dogs were used (which has nothing to do with how well dogs can detect certain species).

2.  Activity is a known and singular quantity for each species (or at least there is just a single assessment even if it might be wrong).  Wouldn't the capture rate by dogs depend on the harvested species?  I'm not understanding what an overall effect of "activity" might mean given the different numbers of each species available to be captured and how well a dog detects any particular species.

3.  It would seem that the units of replication would have to acknowledge multiple harvests (and I'm assuming that there are multiple harvests some with dogs and some without) and different areas harvested rather than the individual captures.  From your description I assume that "Dogs" are applied to individual "harvests".  (Not to mention the use of different dogs or numbers of dogs.)

Jim Baldwin


-----Original Message-----
From: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of Jeremy Koster
Sent: Thursday, September 20, 2012 6:27 PM
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] A modeling question for the ecologists

This message is directed especially toward the ecologists like Ben Bolker, but I'd welcome any advice . . .

We have data on the harvests of wildlife.  For approximately 5,000 harvested animals (and thus 5,000 rows in the data frame), we have these variables:
1. "Species" (about 20 in all)
2. "Activity" (whether the harvested species is predominantly diurnal or nocturnal)
3. "Time" (the time of day at which the animal was harvested)
4. "Dogs" (whether or not dogs were present and presumably assisting when the animal was captured)

The working hypothesis is that dogs increase harvests of nocturnal species because they can sniff them out during the day when they're sleeping and track them at night when vision is limited (almost all nocturnal hunting involves dogs in this setting).

So initially I was inclined to specify a model:

library(lme4)  # provides glmer()
model <- glmer(Dogs ~ Activity * Time + (1 | Species), family = binomial, data = d)

That was partly because "Activity" is essentially a species-level variable, so it felt appropriate to include it as a predictor with a random effect for "Species."
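
Whether "Activity" really is constant within each species can be checked directly. The following is a minimal sketch on a made-up toy data frame (the species labels and values are hypothetical); with the real data, only the last few lines would be needed.

```r
# Hypothetical toy data: three species, each with a single activity label.
d <- data.frame(
  Species  = c("sp_a", "sp_a", "sp_b", "sp_b", "sp_b", "sp_c"),
  Activity = c("nocturnal", "nocturnal", "diurnal", "diurnal", "diurnal", "nocturnal")
)

# Number of distinct Activity values observed within each species.
n_levels <- tapply(d$Activity, d$Species, function(x) length(unique(x)))
print(n_levels)

# TRUE if Activity never varies within a species, i.e. it is a species-level variable.
activity_is_species_level <- all(n_levels == 1)
print(activity_is_species_level)
```

If the check comes back TRUE, knowing the species tells you the activity category exactly, which may be worth keeping in mind for any model that places "Activity" on the left side while also grouping by "Species".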

Intuitively, though, we tend to think of "Activity" as the outcome variable and the presence of dogs as a predictor, which raises or lowers the preponderance of nocturnal species in the harvest.

Is there a good way to model these data while retaining that sense of causality -- in other words, could we put Activity on the left side of the equation?


_______________________________________________
R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models







