[R-sig-ME] GLMM Power Analysis

Tue Apr 21 22:54:52 CEST 2015

I am attempting to do a simulation-based power analysis for
a study I will be conducting which will be analysed using GLMM.  

My study will focus on assessing different sampling and
analytic techniques for evaluating the number of individuals present in a water sample.  I will focus on two size classes of organisms: large and small.  For
the large size class, I will be analyzing 3 different sampling devices and 6
analytic techniques.  Devices can only be tested in pairs but the analytic
techniques will all be used with 3 replicate subsamples for each sample. 
For the small size class, I will not be testing a sampling device but
instead looking at the number of individuals present before and after
treatment.  For this size class, I have 10 different analytic techniques
which will be run in triplicate for each sample. The experiment will be conducted in the field, so the number of individuals actually present in the water will vary between trials. 

I understand how to implement the example given on the R
GLMM help page.  The stumbling block that I am having is that
I expect that there will be an interaction between my fixed effects (analytic methods
and sampling methods (where applicable)) and the random effect for trial (which
will affect the number of individuals actually present in the sample). 
For example, I anticipate that rather than each analytic device missing a set
number of individuals, each device will capture a given proportion of the
individuals present in the trial (e.g. 90%, 80%, 70%...).  Similarly, I expect that each sampling device will capture a given proportion of the individuals, so these will be multiplicative (i.e. b0*% captured by sampling device*% measured by analytic device).  
I am
not sure how to set up  the GLMM for the power analysis
in this case or how to look at the power when there are so many devices (i.e.
is that trial ‘significant’ if just 1 device is significantly different, or should
differences for all devices be significant?).

 Based on the GLMM help page, I have constructed the following for a simple version (6 analytic methods, no sampling devices, no treatment).  However, I am not sure if I am violating assumptions by setting it up in this way or whether I am pulling the appropriate statistics given the large number of devices that I am testing (i.e. is any significant effect of method the best statistic to look at?).  Any comments, assistance, or helpful resources would be very much appreciated. 
---------------------------------------------------------------------------------------------------------------------------------------------------------------
sim1 <- function(b0=1000,                 bmethod2=0.9,bmethod3=0.8,bmethod4=0.7,bmethod5=0.85,bmethod6=0.9,                 vtrial=10000, vsample=25, Verror=20) {  trial <- rep( 1:30, each=3*6)  replicate<-rep(c(1,2,3),length.out=30*6*3)  sample<-rep(1:180,each=3)  method<-rep(rep(1:6,each=3),length.out=540)    t.re <- rpois(30, sqrt(vtrial))   # random effects per trial  S.re <- rpois(180, sqrt(vsample))   # random effects per sample  eps <- rpois(30*18, sqrt(Verror))  # epsilons    # put it all together  Number <- b0 *     (1*(method=="1")+bmethod2*(method=="2") + bmethod3*(method=="3") +    bmethod4*(method=="4") + bmethod5*(method=="5")+    bmethod6*(method=="6")) +    S.re[sample] + t.re[trial] + eps    # put into a data frame  mydata <<- data.frame( trial = trial,                         replicate=paste('r',replicate, sep=''),                        sample=sample,                         method=method,                        Number = Number)
  mydata$trial=as.factor(mydata$trial)  mydata$method=as.factor(mydata$method)  mydata$sample=as.factor(mydata$sample)
  #look at whether difference in methods is detected  fit1 <- glmer( Number ~ method+ (1|trial)+ (1|sample), data=mydata, family="poisson")  fit2 <- glmer( Number ~ (1|trial)+ (1|sample), data=mydata,family="poisson")  anova(fit2,fit1)[2,8]}
out1 <- replicate( 100, {setWinProgressBar(pb, getWinProgressBar(pb)+1);                         sim1( bSex=10, bFreq=2, bSF=0.25, Vsub=4000, Vword=2500, Verror=10000)})hist(out1)mean( out1 < 0.05 )

	[[alternative HTML version deleted]]