[R] hurdle, simulated power
David Atkins
datkins at u.washington.edu
Wed May 4 17:13:30 CEST 2011
Hi all--
We are planning an intervention study for adolescent alcohol use, and I
am planning to use simulations based on a hurdle model (using the
hurdle() function in package pscl) for sample size estimation.
The simulation code and power code are below -- note that at the moment
the "power" code is just returning the coefficients, as something isn't
working quite right.
The average estimates from code below are:
count_(Intercept) count_trt zero_(Intercept)
2.498327128 -0.000321315 0.910293501
zero_trt
-0.200134813
Three of the four look right (ie, converging to population values), but
the count_trt is stuck at zero, regardless of sample size (when it
should be ~ -0.20).
Does anyone see what's wrong?
Thanks for any input.
cheers, Dave
mysim <- function(n, beta0, beta1, alpha0, alpha1, theta){
trt <- c(rep(0,n), rep(1,n))
### mean function logit model
p0 <- exp(alpha0 + alpha1*trt)/(1 + exp(alpha0 + alpha1*trt))
### 0 / 1 based on p0
y1 <- as.numeric(runif(n)>p0)
### mean function count portion
mu <- exp(beta0 + beta1*trt)
### estimate counts using NB dist
require(MASS, quietly = TRUE)
y2 <- rnegbin(n, mu = mu, theta = theta)
### if y2 = 0, draw new value
while(sum(y2==0)>0){
y2[which(y2==0)] <- rnegbin(length(which(y2==0)), mu=mu[which(y2==0)],
theta = theta)
}
y<-y1*y2
data.frame(trt=trt,y=y)
}
#alpha0, alpha1 is the parameter for zero part
#beta0,beta1 is the parameter for negative binomial
#theta is dispersion parameter for negative binomial, infinity
correspond to poisson
#
#example power analysis
#return three power, power1 for zero part, power2 for negative binomial part
#power3 for joint test,significance level can be set, default is 0.05
#M is simulation time
#require pscl package
#library(pscl)
mypower <- function(n, beta0, beta1, alpha0, alpha1, theta,
siglevel=0.05, M=1000){
myfun <- function(n,beta0,beta1,alpha0,alpha1,theta,siglevel){
data <- mysim(n,beta0,beta1,alpha0,alpha1,theta)
require(pscl, quietly = TRUE)
res <- hurdle(y ~ trt, data = data, dist = "negbin", trace = FALSE)
est <- coef(res)#[c(2,4)]
#v<-res$vcov[c(2,4),c(2,4)]
#power1<-as.numeric(2*pnorm(-abs(est)[2]/sqrt(v[2,2]))<siglevel)
#power2<-as.numeric(2*pnorm(-abs(est)[1]/sqrt(v[1,1]))<siglevel)
#power3<-as.numeric((1-pchisq(t(est)%*%solve(v)%*%est,df=2))<siglevel)
#c(power1,power2,power3)
}
r <- replicate(M, myfun(n,beta0,beta1,alpha0,alpha1,theta,siglevel),
simplify=TRUE)
apply(r, 1, mean)
}
out <- mypower(n = 1000, beta0 = 2.5, beta1 = -0.20,
alpha0 = -0.90, alpha1 = 0.20,
theta = 2.2, M = 100)
out
--
Dave Atkins, PhD
Research Associate Professor
Department of Psychiatry and Behavioral Science
University of Washington
datkins at u.washington.edu
Center for the Study of Health and Risk Behaviors (CSHRB)
1100 NE 45th Street, Suite 300
Seattle, WA 98105
206-616-3879
http://depts.washington.edu/cshrb/
(Mon-Wed)
Center for Healthcare Improvement, for Addictions, Mental Illness,
Medically Vulnerable Populations (CHAMMP)
325 9th Avenue, 2HH-15
Box 359911
Seattle, WA 98104
http://www.chammp.org
(Thurs)
More information about the R-help
mailing list