[R-sig-ME] Model assessment strategy for binomial mixed model

Adam Smith raptorbio at hotmail.com
Thu Oct 27 18:24:52 CEST 2011


All,

A couple of weeks ago, I solicited assistance with a binomial mixed model with nested random effects.  After much research and guidance from this list, I've formulated an analysis plan that I'd like to run by anyone who would graciously take time to evaluate the process.

The data structure:

> str(abs2009)
'data.frame':   416 obs. of  8 variables:
 $ obs  : Factor w/ 416 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ plot : Factor w/ 16 levels "N_10","N_2","N_3",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ geog : Factor w/ 2 levels "N","S": 1 1 1 1 1 1 1 1 1 1 ...
 $ trt  : Factor w/ 2 levels "CONT","TRT": 1 1 1 1 1 1 1 1 1 1 ...
 $ count: Factor w/ 14 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ total: int  143 139 137 135 131 127 123 115 112 109 ...
 $ abs  : int  4 2 2 4 4 4 8 3 3 18 ...
 $ dT   : int  4 3 3 3 3 3 3 3 3 3 ...

The number of fruit abscised (abs) relative to the total number of fruit (total) is to be evaluated as a function of treatment (trt), geography (geog), and over time (count); abscission is highly stochastic, thus the use of a factor for the count variable.  dT is the length, in days, of the count period during which abscission occurred.  There are 16 plots, each composed of two subplots receiving a treatment or control at random.


Step 1: Evaluate the need to account for overdispersion using individual level random effects.  

I used the full model with all 2-way interactions and a LRT for the overdispersion parameter:

> mod1 <- glmer (cbind(abs, total-abs) ~ trt + geog + count + trt:geog + trt:count + geog:count + (1|plot) + (1|trt:plot), family = binomial(link="cloglog"), data=abs2009)
> mod2 <- glmer (cbind(abs, total-abs) ~ trt + geog + count + trt:geog + trt:count + geog:count + (1|plot) + (1|trt:plot) + (1|obs), family = binomial(link="cloglog"), data=abs2009)
> anova(mod1,mod2) #overdispersion present

Data: abs2009
Models:
mod1: cbind(abs, total - abs) ~ trt + geog + count + trt:geog + trt:count + geog:count + (1 | plot) + (1 | trt:plot)
mod2: cbind(abs, total - abs) ~ trt + geog + count + trt:geog + trt:count + geog:count + (1 | plot) + (1 | trt:plot) + (1 | obs)
         Df    AIC    BIC       logLik  Chisq    Chi Df Pr(>Chisq)    
mod1 45 1832.7 2014.1 -871.35                             
mod2 46 1136.4 1321.8 -522.21 698.29      1  < 2.2e-16 ***


Step 2: Determine whether length of count period needs to be accounted for.

I chose to plot mod2 residuals as a function of dT:

> boxplot(mod2 at resid~abs2009$dT) # Plot shows no effect of dT (plot at tinyurl.com/dT-vs-resids)


Step 3:  Evaluate 2-way interactions:

I used LRT, but if the p-value was marginal, say <= 0.10, I would use parametric bootstrapping.  Sound reasonable?  

> drop1(mod2, test="Chisq") # None improve the model significantly; dropping all from model
Single term deletions

Model:
cbind(abs, total - abs) ~ trt + geog + count + trt:geog + trt:count + geog:count + (1 | plot) + (1 | trt:plot) + (1 | obs)
               Df      AIC     LRT       Pr(Chi)
<none>           1136.4                
trt:geog       1  1136.5  2.0470  0.1525
trt:count     13 1119.0  8.6243  0.8007
geog:count 13 1129.1 18.6701  0.1337


Step 4: Evaluate main effects not involved in retained interaction.

I evaluated them similarly to the interactions, above...

> drop1(mod3, test="Chisq") # trt insignificant; test geog and count via parametric bootstrapping
Single term deletions

Model:
cbind(abs, total - abs) ~ trt + geog + count + (1 | plot) + (1 | trt:plot) + (1 | obs)
       Df    AIC     LRT   Pr(Chi)    
<none>    1111.8                      
trt     1 1109.8   0.019  0.890959    
geog    1 1117.2   7.454  0.006331 ** 
count  13 1199.1 113.343 < 2.2e-16 ***


Step 5: Bootstrap main effects of geog and count; not shown


Any suggestions to improve this process?

Thanks for considering,

Adam Smith
Dept. Natural 
Resources Science
105 Coastal Institute in Kingston
University of Rhode Island
Kingston, RI 02881
Phone: 859-559-7172
Fax: 401-874-4561

 		 	   		  



More information about the R-sig-mixed-models mailing list