[R] bootstrap sample for clustered data
Liu, Lei
lei@liu @ending from wu@tl@edu
Sun Sep 16 19:39:41 CEST 2018
Hi there,
I tried to generate bootstrap samples for clustered data. Here is some code I found on the web to do the work:
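id <- c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5)
y <- c(.5, .6, .4, .3, .4, 1, .9, 1, .5, 2)
x <- c(0, 0, 1, 1, 0, 0, 1, 1, 1, 1)
xx <- data.frame(id, x, y)
# Resample whole clusters (ids) with replacement and stack their rows
boot.cluster <- function(x, id) {
  # draw as many cluster ids as there are distinct clusters
  boot.id <- sample(unique(id), replace = TRUE)
  # pull all rows of each drawn cluster; a cluster drawn twice appears twice
  out <- lapply(boot.id, function(i) x[id %in% i, ])
  do.call("rbind", out)
}
boot.pro <- boot.cluster(xx, xx$id)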
Now I have the output:
   id x   y
5   3 0 0.4
6   3 0 1.0
51  3 0 0.4
61  3 0 1.0
9   5 1 0.5
10  5 1 2.0
52  3 0 0.4
62  3 0 1.0
3   2 1 0.4
4   2 1 0.3
However, the id column still holds the original ids, whereas I want new ids (1, 1, 2, 2, 3, 3, 4, 4, 5, 5) for later analysis. Can anyone show me how to do this? Note that the same original id may appear more than once, since the bootstrap sample is drawn with replacement. Thanks a lot!
Lei
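One possible way to get the new ids, as a minimal sketch rather than a definitive answer: number the draws themselves instead of reusing the original ids. The function name boot.cluster.newid and the column name new.id are made up for illustration; everything else uses base R only. The original id column is kept so each bootstrap cluster can still be traced back to its source.

```r
# Sketch: cluster bootstrap that labels each drawn cluster with a fresh id.
# boot.cluster.newid and new.id are hypothetical names, not from the original code.
boot.cluster.newid <- function(x, id) {
  boot.id <- sample(unique(id), replace = TRUE)
  out <- lapply(seq_along(boot.id), function(k) {
    rows <- x[id %in% boot.id[k], , drop = FALSE]
    rows$new.id <- k  # the k-th draw gets id k, even if its original id repeats
    rows
  })
  res <- do.call("rbind", out)
  rownames(res) <- NULL  # drop the 51, 61, ... row names created by rbind
  res
}

id <- c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5)
y <- c(.5, .6, .4, .3, .4, 1, .9, 1, .5, 2)
x <- c(0, 0, 1, 1, 0, 0, 1, 1, 1, 1)
xx <- data.frame(id, x, y)

boot.pro <- boot.cluster.newid(xx, xx$id)
boot.pro$new.id
```

Because every cluster here has two rows, new.id comes out as 1, 1, 2, 2, 3, 3, 4, 4, 5, 5 regardless of which clusters are drawn. A cluster drawn twice gets two different new ids, which is usually what a later cluster-level analysis needs.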