[R] Simulate data with binary outcome
Steve Frost
S.Frost at uws.edu.au
Wed Jul 16 07:40:24 CEST 2008
Dear R-Users,
I wish to simulate a data set with a binary outcome and several
predictors (in the example below: age, sex and systolic BP). Is there a
way to set the frequency of the outcome (y) to, say, 5% (versus the
0.1% obtained with the seed below)?
# Example R-code based on Frank Harrell's Design help files
library(Hmisc)
library(rms)    # provides datadist() and lrm(); successor to the Design package
n <- 1000
set.seed(123456)
age <- runif(n, 60, 90)
sbp <- rnorm(n, 120, 15)
sex <- factor(sample(c('female', 'male'), n, replace = TRUE))
# Specify population model for log odds that CHD = Yes
L <- 0.4*(sex == 'male') +
     0.045*(age) +
     0.05*(sbp)
# Simulate binary y to have Prob(y = 1) = 1/[1+exp(-L)]
y <- ifelse(runif(n) < plogis(L), 1, 0)
table(y)
ddist <- datadist(sex,age,sbp)
options(datadist = 'ddist')
fit <- lrm(y ~ sex + age + sbp)
summary(fit)
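
One possible way to target a given outcome frequency, offered only as a minimal sketch: the linear predictor above has no intercept, so for typical values (for instance 0.045*75 + 0.05*120, about 9.4) plogis(L) is very close to 1. Adding an intercept and solving for the value that makes the average probability equal the target would shift the prevalence while leaving the slopes unchanged. The names `target` and `b0` are introduced here purely for illustration, and root finding with `uniroot()` is just one of several ways this could be done.

```r
# Sketch: choose an intercept so that the average Prob(y = 1) equals a target.
# 'target' and 'b0' are hypothetical names, not part of the original example.
set.seed(123456)
n   <- 1000
age <- runif(n, 60, 90)
sbp <- rnorm(n, 120, 15)
sex <- factor(sample(c('female', 'male'), n, replace = TRUE))

# Linear predictor without an intercept, as in the example above
L <- 0.4*(sex == 'male') + 0.045*age + 0.05*sbp

target <- 0.05   # desired outcome frequency

# Solve mean(plogis(b0 + L)) = target for b0 by one-dimensional root finding
b0 <- uniroot(function(a) mean(plogis(a + L)) - target,
              interval = c(-50, 50))$root

p <- plogis(b0 + L)     # individual probabilities after the shift
y <- rbinom(n, 1, p)    # simulated binary outcome

mean(p)    # close to the target by construction
table(y)   # observed counts; the sample frequency varies around the target
```

The slopes for sex, age and sbp are untouched, so a model fitted to these data should estimate the same log odds ratios as before; only the intercept moves. The observed frequency in any single sample will differ somewhat from 5% because y is still drawn at random.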
================================
Steve Frost MPH
University of Western Sydney
Building 7
Campbelltown Campus
Locked Bag 1797
PENRITH SOUTH DC 1797
Phone 61+ 2 4620 3415
Mobile 0407 291088
Fax 61+ 2 4625 4252
================================