[R] generate 3 distinct random samples without replacement

Duncan Murdoch murdoch.duncan at gmail.com
Mon Mar 7 21:52:27 CET 2011


On 07/03/2011 2:17 PM, Cesar Hincapié wrote:
> Hello:
>
> I wonder if I could get a little help with random sampling in R.
>
> I have a vector of length 7375.  I would like to draw 3 distinct random samples, each of length 100 without replacement.  I have tried the following:
>
> d1<- 1:7375
>
> set.seed(7)
> i<- sample(d1, 100, replace=F)
> s1<- sort(d1[i])
> s1
>
> d2<- d1[-i]
> set.seed(77)
> j<- sample(d2, 100, replace=F)
> s2<- sort(d2[j])
> s2
>
> d3<- d2[-j]
> set.seed(777)
> k<- sample(d3, 100, replace=F)
> s3<- sort(d3[k])
> s3
>
> D<- data.frame(a=s1,b=s2,c=s3)
>
>
> However, s2 is only 97 elements long, and s3, only 96 long.
>
> I would appreciate any suggestions on a better approach.
> I'm also curious to know why my second and third samples are less than 100 elements in length.

If you want 3 non-overlapping, non-repeating samples of 100, why not 
draw one sample of 300, and take 3 subsets of it?
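
A minimal sketch of that idea, reusing the names d1, s1, s2, s3 and D from the quoted code. The seed and the intermediate name s are arbitrary choices for illustration. It samples positions with sample.int() rather than values, so it would work the same way if d1 held something other than 1:7375.

```r
d1 <- 1:7375

set.seed(7)                               # arbitrary seed, only for reproducibility
s <- d1[sample.int(length(d1), 300)]      # one draw of 300, without replacement

# Three consecutive blocks of 100 cannot overlap, because s has no repeats
s1 <- sort(s[1:100])
s2 <- sort(s[101:200])
s3 <- sort(s[201:300])

D <- data.frame(a = s1, b = s2, c = s3)
```

Since the 300 values come from a single draw without replacement, the three columns of D share no elements and each has exactly 100 rows.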

The reason you were finding shorter samples is that j and k hold values 
drawn from d2 and d3, but you then used them as indices into those 
vectors.  d2 has only 7275 elements and d3 even fewer, so any index 
beyond the end returns NA, and sort() drops NAs by default.  For example,

d2 <- 1:10
d2[10:12]        # 10 NA NA: indices 11 and 12 are past the end of d2
sort(d2[10:12])  # 10: the NAs have been dropped

See ?sort (the na.last argument) for an explanation of how to keep NA 
values when you sort.
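
To see the same effect in the original code, something like the following could be run. It is only a sketch: the name j_pos is made up here, and the exact counts depend on the R version's random number generator, so no output is shown. The last two lines show one possible repair of the step-by-step approach, which is to sample positions in d2 rather than values.

```r
d1 <- 1:7375

set.seed(7)
i  <- sample(d1, 100)
d2 <- d1[-i]                          # 7275 elements remain

set.seed(77)
j  <- sample(d2, 100)                 # j holds values of d2, which can exceed 7275

sum(j > length(d2))                   # number of indices past the end of d2
sum(is.na(d2[j]))                     # the same number of NAs appears in d2[j]
length(sort(d2[j]))                   # shorter than 100 whenever NAs were dropped
length(sort(d2[j], na.last = TRUE))   # always 100: NAs are kept at the end

# Possible repair: draw positions, which are always within range
j_pos <- sample.int(length(d2), 100)
s2    <- sort(d2[j_pos])
```

With positions instead of values, s2 always has 100 elements and none of them can coincide with the first sample, because those were already removed from d2.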

Duncan Murdoch

> Thanks for your time and consideration,
>
> Cesar A. Hincapié, DC, MHSc
>
> Research Fellow, Division of Health Care and Outcomes Research, Toronto Western Research Institute
> PhD Candidate in Epidemiology, Dalla Lana School of Public Health, University of Toronto
> e. cesar.hincapie at utoronto.ca


