[R] bootstrap sample for clustered data

Bert Gunter bgunter.4567 using gmail.com
Sun Sep 16 22:39:24 CEST 2018


(I neglected to cc this to the list -- Bert)


On Sun, Sep 16, 2018 at 1:36 PM Bert Gunter <bgunter.4567 using gmail.com> wrote:

> You can do a mixed effects model using the existing id's without recoding.
>
> But if you insist, is this the sort of thing you want?
>
> set.seed(-12345) # for reproducibility
>
> id <- factor(sample(2:5, 10, replace = TRUE))
> id
> new.id <- factor(id, labels = seq_along(levels(id)))
> new.id
>
> Note: There's a slightly slicker way to do this, but it bypasses the
> factor() API, and I prefer not to do that.
>
> Cheers,
> Bert
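
To make the effect of the relabeling above concrete, here is a minimal sketch on a fixed id vector. The vector is an assumption: it is copied from the id column of the bootstrap output quoted further down, not generated by the seed above. The sketch also shows as.integer() applied to the factor, which returns the same codes; whether that is the "slicker way" mentioned above is only a guess.

```r
# ids as they appear in the quoted bootstrap output (an assumed, fixed example)
id <- factor(c(3, 3, 3, 3, 5, 5, 3, 3, 2, 2))

# Relabel the sorted levels "2", "3", "5" as 1, 2, 3
new.id <- factor(id, labels = seq_along(levels(id)))
print(new.id)

# The underlying integer codes of the factor carry the same information
codes <- as.integer(id)
print(codes)
```

Note that every draw of original id 3 receives the same label here, so the result has 3 distinct labels rather than 5.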
>
>
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Sun, Sep 16, 2018 at 12:52 PM Liu, Lei <lei.liu using wustl.edu> wrote:
>
>> Sorry for the confusion. I just want to recode the id variable to 1 to 5
>> in the bootstrapped sample. This way I can do e.g., a mixed effects model
>> using the new id as the cluster. Thanks!
>>
>> Lei
>>
>>
>>
>> *From:* Bert Gunter [mailto:bgunter.4567 using gmail.com]
>> *Sent:* Sunday, September 16, 2018 2:21 PM
>> *To:* Liu, Lei <lei.liu using wustl.edu>
>> *Cc:* R-help <r-help using r-project.org>
>> *Subject:* Re: [R] bootstrap sample for clustered data
>>
>>
>>
>> I can't make any sense of your post. Id 3 occurs 6 times, and 2 and 5
>> occur twice each in your example. How do you get (1,1,2,2,3,3,4,4,5,5) out
>> of that? In other words, specify the mapping of old id's to new.
>>
>>
>>
>> Bert
>>
>> On Sun, Sep 16, 2018 at 11:51 AM Liu, Lei <lei.liu using wustl.edu> wrote:
>>
>> Hi there,
>>
>> I tried to generate bootstrap samples for clustered data. Here is some
>> code I found on the web to do the work:
>>
>> id <- c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5)
>> y <- c(.5, .6, .4, .3, .4, 1, .9, 1, .5, 2)
>> x <- c(0, 0, 1, 1, 0, 0, 1, 1, 1, 1)
>>
>> xx <- data.frame(id, x, y)
>>
>> # Resample whole clusters (ids) with replacement and stack their rows
>> boot.cluster <- function(x, id) {
>>   boot.id <- sample(unique(id), replace = TRUE)
>>   out <- lapply(boot.id, function(i) x[id %in% i, ])
>>   do.call("rbind", out)
>> }
>>
>> boot.pro <- boot.cluster(xx, xx$id)
>>
>> Now I get the following output:
>>
>>    id x   y
>> 5   3 0 0.4
>> 6   3 0 1.0
>> 51  3 0 0.4
>> 61  3 0 1.0
>> 9   5 1 0.5
>> 10  5 1 2.0
>> 52  3 0 0.4
>> 62  3 0 1.0
>> 3   2 1 0.4
>> 4   2 1 0.3
>>
>> However, the id variable still holds the original ids, whereas I want the new
>> ids to be (1, 1, 2, 2, 3, 3, 4, 4, 5, 5) for later analysis. Can anyone show me
>> how to do that? Note that the same original id may appear more than once, since
>> the bootstrap sample is drawn with replacement. Thanks a lot!
>>
>> Lei
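
One way to obtain the ids described in the question is to number each drawn cluster at the moment it is drawn, so that repeated draws of the same original id become separate clusters. The following is a minimal sketch of that idea, not code from this thread: the function name boot.cluster.relabel and the column name new.id are hypothetical, and it assumes that each draw should count as its own cluster.

```r
# Hypothetical variant of boot.cluster(): each draw gets its own label 1..k
boot.cluster.relabel <- function(x, id) {
  # Draw cluster ids with replacement (assumes more than one distinct id)
  boot.id <- sample(unique(id), replace = TRUE)
  out <- lapply(seq_along(boot.id), function(k) {
    rows <- x[id %in% boot.id[k], , drop = FALSE]
    rows$new.id <- k  # position of the draw, not the original id
    rows
  })
  res <- do.call("rbind", out)
  rownames(res) <- NULL  # drop the duplicated row names such as 51, 61
  res
}

# Data from the question
xx <- data.frame(id = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5),
                 x  = c(0, 0, 1, 1, 0, 0, 1, 1, 1, 1),
                 y  = c(.5, .6, .4, .3, .4, 1, .9, 1, .5, 2))

set.seed(1)  # arbitrary seed, only so the draw can be repeated
boot.pro <- boot.cluster.relabel(xx, xx$id)
print(boot.pro)
```

Because every cluster in this example has exactly two rows, new.id comes out as 1, 1, 2, 2, 3, 3, 4, 4, 5, 5 whichever clusters are drawn, while the id column keeps the original labels for reference.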