[R] cohort sampling
Terry Therneau
therneau at mayo.edu
Tue Jul 1 14:55:00 CEST 2008
> Now that we have case cohort model, we have 1000 people and 50 cases
> Let the first 10 cases occur at the same time
> second 10 "
> third 10 "
> fourth 10 "
> fifth 10 "
> How easy is it to randomly sample 50 different
> cohort controls for each group?
> That is:
> randomly sample 50 cohort controls for the first 10 cases from all 1,000
> randomly sample 50 new cohort controls for the second 10 cases from the
> surviving 990
...
---------------
Your message actually describes a nested case-control design; a case-cohort
design would sample from all subjects at the start of the study. Note that it
is important in these designs not to look into the future: someone who
becomes a case at time t+s is still eligible to be a control at time t.
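As a small illustration of that eligibility rule, here is a sketch on made-up
data (the five follow-up times, the event indicators, and the subcohort size
of 3 are all hypothetical, not taken from the question). It lists who is
eligible as a control at the first event time, and contrasts that with a
case-cohort style draw made once at baseline.

```r
# Hypothetical toy data: five subjects, follow-up time and event indicator
time   <- c(2, 5, 5, 8, 9)
status <- c(1, 0, 1, 1, 0)

t0 <- 2                                        # first event time
case.at.t0 <- (time == t0 & status == 1)       # the case(s) at t0
eligible   <- which(!case.at.t0 & time >= t0)  # controls eligible at t0

# Subjects 3 and 4 become cases later (at times 5 and 8) but are still
# event-free at t0, so they remain eligible as controls at t0.
print(eligible)

# A case-cohort design would instead draw one subcohort at baseline from
# everyone, without reference to the event times (size 3 is arbitrary).
subcohort <- sample(length(time), 3)
print(subcohort)
```

Excluding subjects 3 and 4 at t0 because they fail later would amount to
using future information when forming the risk set.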
Here is some sample code for the scheme you describe; I am sure that others
can do better. Assume variables 'time' = follow-up time for each subject,
'status' = 1 if there was an event at the last follow-up, and x1, x2 are
covariates. Assume time>0 for all subjects, since the code uses 0 to mark
subjects who have not been chosen.
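    library(survival)                          # coxph(), Surv(), strata()
    n <- length(time)
    casetime <- sort(unique(time[status==1]))  # all the event times, in order
    chosen <- rep(0,n)               # marks the case and control groups
    for (i in casetime) {
        cases <- (time==i & status==1)
        potential <- (1:n)[!cases & chosen==0 & time >=i] # potential controls
        # sample 50 of them (or all, if fewer remain); indexing through
        # sample.int() avoids sample() treating a single value as 1:value
        k <- min(50, length(potential))
        new.control <- potential[sample.int(length(potential), k)]
        chosen[new.control] <- i     # remember who was chosen
        chosen[cases] <- i           # link them to the right case (this
                                     # overrides an earlier control label)
    }
    fit <- coxph(Surv(time, status) ~ x1 + x2 + strata(chosen),
                 subset= (chosen > 0))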
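Before fitting, it may be worth checking the sampled sets. The following is
only a sketch on simulated data: the sample size, the exponential event and
censoring rates, and the seed are all made up for illustration, and the guard
for fewer than 50 eligible controls is an assumption added here. It runs the
same sampling loop and then confirms that no control was censored before its
matched event time and that no set has more than 50 controls.

```r
set.seed(1)                                  # arbitrary seed, for repeatability
n <- 1000
etime  <- rexp(n, rate = 0.01)               # hypothetical event times
ctime  <- rexp(n, rate = 0.5)                # hypothetical censoring times
time   <- pmin(etime, ctime)                 # observed follow-up time
status <- as.numeric(etime <= ctime)         # 1 = event, 0 = censored

casetime <- sort(unique(time[status == 1]))  # event times, earliest first
chosen <- rep(0, n)                          # 0 = not in any sampled set
for (i in casetime) {
    cases <- (time == i & status == 1)
    potential <- which(!cases & chosen == 0 & time >= i)
    k <- min(50, length(potential))          # guard: take all if fewer than 50
    new.control <- potential[sample.int(length(potential), k)]
    chosen[new.control] <- i
    chosen[cases] <- i
}

ctrl   <- (status == 0 & chosen > 0)         # subjects kept as controls
n.ctrl <- table(chosen[ctrl])                # number of controls per set

# Checks: controls were still under observation at their set's event time,
# every case carries its own event time as label, and no set exceeds 50.
ok.time  <- all(time[ctrl] >= chosen[ctrl])
ok.cases <- all(chosen[status == 1] == time[status == 1])
ok.size  <- all(n.ctrl <= 50)
print(c(ok.time, ok.cases, ok.size))
```

Because a subject is used in at most one set, later event times can run short
of eligible controls; inspecting n.ctrl shows how many sets ended up with
fewer than 50.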
Terry Therneau