[R] Conditional sampling

Duncan Murdoch murdoch.duncan at gmail.com
Thu Feb 10 17:18:03 CET 2011


On 10/02/2011 11:02 AM, Hosack, Michael wrote:
> R experts,
>
> I need to sample two rows without replacement from the following data frame such that
> the two rows do not share the same 'DOW'. For example, I cannot select both a Monday morning
> and a Monday afternoon. I am using STRATA_NUM as the index from which to randomly select rows,
> since this variable indexes all unique combinations of DOW, SITE, and TOD. I know how to
> use the sample function to select rows; I just don't know how to sample with a constraint.
> This seems simple, but I can't find a simple solution. Any help would be greatly
> appreciated.

I would use a rejection sampler, since most unconstrained samples already
satisfy your constraint.  That is, sample without the constraint, then
throw away any sample that violates it and draw again.
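
As a rough illustration of that idea (a sketch, not code from the original
message): with 40 rows and 8 rows per day, a second row shares the first
row's DOW with probability 7/39, so about 18% of draws are rejected. The
helper name sample_distinct_dow and its max_tries argument are hypothetical,
and the DF built here with expand.grid is only a stand-in with the same
DOW/SITE/TOD layout as the poster's data frame.

```r
# Stand-in for the poster's DF (assumption): one row per
# DOW x SITE x TOD combination, 40 rows in total.
DF <- expand.grid(TOD  = c("Morn", "Aftn"),
                  SITE = 101:104,
                  DOW  = c("Mon", "Tue", "Wed", "Thu", "Fri"),
                  stringsAsFactors = FALSE)

# Hypothetical helper: draw n rows without replacement and redraw
# until no two selected rows share a DOW.
sample_distinct_dow <- function(df, n = 2, max_tries = 1000) {
  # The constraint cannot be met if n exceeds the number of distinct days.
  stopifnot(n <= length(unique(df$DOW)))
  for (i in seq_len(max_tries)) {
    idx <- sample(nrow(df), n)            # unconstrained draw of row indices
    if (!anyDuplicated(df$DOW[idx])) {    # accept only if all DOWs differ
      return(df[idx, , drop = FALSE])
    }
  }
  stop("no valid sample found in ", max_tries, " tries")
}

set.seed(1)  # only to make the illustration reproducible
picked <- sample_distinct_dow(DF)
picked
```

The same function should work on the original DF unchanged, since it only
uses the DOW column; sampling on STRATA_NUM instead of row numbers would
need a lookup back to the rows.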

Duncan Murdoch

> Thank you,
>
> Mike
>
>
>
> DF<-
> structure(list(DOW = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
> 5L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L), .Label = c("Fri", "Mon", "Thu", "Tue", "Wed"), class = "factor"),
>      SITE = c(101L, 101L, 102L, 102L, 103L, 103L, 104L, 104L,
>      101L, 101L, 102L, 102L, 103L, 103L, 104L, 104L, 101L, 101L,
>      102L, 102L, 103L, 103L, 104L, 104L, 101L, 101L, 102L, 102L,
>      103L, 103L, 104L, 104L, 101L, 101L, 102L, 102L, 103L, 103L,
>      104L, 104L), TOD = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L,
>      2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
>      1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
>      2L, 1L, 2L), .Label = c("Aftn", "Morn"), class = "factor"),
>      STRATA_NUM = c(2L, 1L, 12L, 11L, 22L, 21L, 32L, 31L, 4L,
>      3L, 14L, 13L, 24L, 23L, 34L, 33L, 6L, 5L, 16L, 15L, 26L,
>      25L, 36L, 35L, 8L, 7L, 18L, 17L, 28L, 27L, 38L, 37L, 10L,
>      9L, 20L, 19L, 30L, 29L, 40L, 39L), DATE = structure(c(1L,
>      1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
>      3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
>      4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L), .Label = c("5/3/2010",
>      "5/4/2010", "5/5/2010", "5/6/2010", "5/7/2010"), class = "factor"),
>      DOW_NUM = c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L,
>      4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L,
>      6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L), WEEK = c(1L,
>      1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
>      1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
>      1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L)), .Names = c("DOW", "SITE",
> "TOD", "STRATA_NUM", "DATE", "DOW_NUM", "WEEK"), class = "data.frame", row.names = c(NA,
> -40L))