[R-sig-ME] model specification for continuous environmental vars

Fri Dec 13 14:29:39 CET 2013

Hans,
Thank you for your reply. Some comments inserted ...

>From: Hans Ekbrand <hans.ekbrand at gmail.com>
>>On Thu, Dec 12, 2013 at 01:39:04PM -0500, Tim Howard wrote:
>> List members - 
>> I am learning a lot, quickly, but still have a way to go. I would 
>> greatly appreciate some help with model specification in glmer.
>> I can't find a good example that parallels what I've got.
>> 
>> My dataset consists of spatially-balanced random samples of rare 
>> plants within alpine summits. There were two sampling bouts (yr1 and yr2)
>> with yr2 collected 6 years after yr1. A new set of random plots was 
>> collected at each bout (i.e. a new estimate of the population, not repeated 
>> measures). I would like to test the difference in plant density from yr1 to yr2, overall. 
>>  
>> These are count data with many zeros, fitting a negative binomial distribution.
>>  
>> This is what confuses me:  I ALSO have environmental information 
>> that influences density, such as elevation, solar radiation, slope (and more)
>> I would like to include these variables in the model, but I am not 
>> exactly sure how. Based on my reading, this is what I think I have:
>>  
>> block random effects: summit
>> continuous random effects: elev, solrad, slope
>> fixed effect: time (=samp)
>> individual random effects to deal with overdispersion: pltID
>>  
>> I have over 350 plots for each sample bout, spread among 17 summits.
>>  
>> Given this I think my model is:
>>  
>> mod <- glmer(count ~ samp + (1|summit) + (1|elev) + (1|solrad) + (1|slope) + (1|pltID), 
>>        data=dat, family="poisson")
>>  
>> My primary questions: 
>> Is this the appropriate way to handle these environmental variables?
>
>From my limited understanding of these issues, I'd say elev, solrad
>and slope should be fixed effects - they are universal in the sense
>that they are defined for every (imaginable) case.
Ah! I had understood that, since any new point would most likely have a
*different* value for these variables, my samples did not cover all the
possibilities, and so these variables needed to be random effects.
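
To check my earlier reasoning against this, here is a minimal sketch with made-up data (the elev values and the number of plots are hypothetical, not my real measurements). It only counts grouping levels: written as (1|elev), a continuous covariate is treated as a grouping factor with roughly one level per plot, so there is nothing to pool within a level.

```r
# Hypothetical data: 350 plots, each with a continuous elevation value
set.seed(42)
n_plots <- 350
elev <- runif(n_plots, min = 1200, max = 1600)  # made-up elevations in metres

# (1 | elev) would use every distinct value as its own group
n_levels <- nlevels(factor(elev))
cat("plots:", n_plots, " grouping levels from elev:", n_levels, "\n")
```

With as many levels as plots, a random intercept on elev says nothing about the trend in density along elevation, which is what a fixed slope estimates.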

>
>If they vary within each summit, you could - in addition to having
>them as fixed terms - also include them as random slope terms:
>(elev|summit) + (solrad|summit) + (slope|summit), if you want to
>explain (some of) the variance between summits.
>
>Since you do not have repeated measures of pltID, you can not include
>a random term for pltID - there is no variance within each value of
>pltID.

Here, I was following earlier discussions about how to deal with overdispersion, such as this one:

https://stat.ethz.ch/pipermail/r-sig-mixed-models/2011q1/015867.html

A key problem is that I would rather fit these data with a negative binomial than with a Poisson distribution.
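
For reference, the two routes I am weighing could be sketched as below. This is only a sketch under assumptions: the data are simulated (every number is invented), elev stands in for all the environmental covariates, and I have not checked either model against my real data. Option A is the observation-level random effect from the thread linked above; Option B uses glmer.nb from lme4.

```r
library(lme4)

# Simulated stand-in for the real data (all values hypothetical)
set.seed(1)
n <- 340
summit_id  <- sample(1:17, n, replace = TRUE)
summit_eff <- rnorm(17, sd = 0.5)            # made-up summit-level deviations
dat <- data.frame(
  summit = factor(summit_id),
  samp   = factor(rep(c("yr1", "yr2"), each = n / 2)),
  elev   = rnorm(n),                         # standardized covariate
  pltID  = factor(seq_len(n))                # one level per plot
)
mu <- exp(0.2 + 0.4 * (dat$samp == "yr2") + 0.3 * dat$elev +
          summit_eff[summit_id])
dat$count <- rnbinom(n, mu = mu, size = 0.8) # overdispersed counts

# Option A: Poisson with an observation-level random effect
mod_olre <- glmer(count ~ samp + elev + (1 | summit) + (1 | pltID),
                  data = dat, family = poisson)

# Option B: negative binomial mixed model
mod_nb <- glmer.nb(count ~ samp + elev + (1 | summit), data = dat)

print(fixef(mod_olre))
print(fixef(mod_nb))
```

Either fit may emit convergence warnings on data like these, so the estimates would need checking before I relied on them.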

>
>I would model like this 
>
>mod <- glmer(count ~ samp + (1|summit) + elev + solrad + slope, data=dat, family="poisson")
>
>And possibly try (but I don't think you have enough data for this)
>
>mod <- glmer(count ~ samp + (1|summit) + elev + solrad + slope + (elev|summit) + (solrad|summit) + (slope|summit), data=dat, family="poisson")
>
Thank you! I'll work through these.
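
One note to myself on the second formula: as far as I understand lme4's syntax, (elev|summit) is shorthand for (1 + elev|summit), so it already carries a random intercept for summit, and keeping (1|summit) alongside it would ask for that intercept variance twice. The sketch below uses a small made-up data set (the toy data frame and its values are hypothetical) and only parses the formulas with glFormula, without fitting anything, to list which random-effect columns each term creates.

```r
library(lme4)

# Small made-up data set; only the structure matters here
set.seed(7)
toy <- data.frame(
  count  = rpois(60, lambda = 2),
  samp   = factor(rep(c("yr1", "yr2"), times = 30)),
  elev   = rnorm(60),
  summit = factor(rep(1:6, each = 10))
)

# Parse the formulas without fitting, then list the random-effect columns
cn_slope <- glFormula(count ~ samp + elev + (elev | summit),
                      data = toy, family = poisson)$reTrms$cnms
cn_only  <- glFormula(count ~ samp + elev + (0 + elev | summit),
                      data = toy, family = poisson)$reTrms$cnms

print(cn_slope)  # summit: "(Intercept)" "elev"
print(cn_only)   # summit: "elev"
```

So I would probably write either (1 + elev|summit) on its own, or (1|summit) + (0 + elev|summit) if the intercept and slope should be uncorrelated.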
Tim