[R] Complex sampling?

Hosack, Michael mhosack at state.pa.us
Wed Mar 9 19:01:32 CET 2011


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Hosack, Michael
> Sent: Wednesday, March 09, 2011 7:34 AM
> To: r-help at R-project.org
> Subject: [R] Complex sampling?
> 
> R users,
> 
> I am trying to generate a randomized weekday survey schedule that ensures
> even coverage of weekdays in
> the sample, where the distribution of variable DOW is random with respect
> to WEEK. To accomplish this I need
> to randomly sample without replacement two weekdays per week for each of
> 27 weeks (only 5 are shown). 

This seems simple enough, sampling without replacement.

> However,
> I need to sample from a sequence (3:7) that needs to be completely
> depleted and replenished until the
> final selection is made. Here is an example of what I want to do,
> beginning at WEEK 1. I would prefer to do
> this without using a loop, if possible.
> 
> sample frame: [3,4,5,6,7] --> [4,5,6] --> [4],[1,2,3,(4),5,6] -->
> [1,2,4,5,6] --> for each WEEK in dataframe

OK, now you have me completely lost.  Sorry, but I have no clue as to what you just did here.  It looks like you are trying to describe some transformation/algorithm, but I don't follow it.



I could not reply to this email directly because it was never delivered to my inbox, so I had to copy it from the forum.
I apologize for the confusion; this would take less than a minute to explain in conversation but an hour
to explain well in print. Two DOW_NUMs will be selected randomly without replacement from the vector 3:7 for each WEEK.
When the vector is reduced to a single integer, that integer will be selected, the vector will be restored, and a
second integer will then be selected that differs from the one just selected (i.e., the same day cannot be sampled
twice in the same week). This process will be repeated until two DOW_NUMs have been assigned to each WEEK. That is
what I attempted to illustrate in my original message. This is beyond my current coding capabilities.
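
The procedure described above might be sketched roughly as follows. This is only one possible reading of the description, not code from the thread: the function name draw_schedule and its arguments are hypothetical, and it uses an explicit loop for clarity even though a loop-free version was requested. It indexes with sample.int() instead of calling sample() on the pool, because sample(x, 1) on a length-one numeric x draws from 1:x.

```r
# Hypothetical helper (an assumption, not from the original post):
# draw `per_week` distinct days for each week from a pool that is
# depleted one pick at a time and restored once it is empty.
draw_schedule <- function(n_weeks, days = 3:7, per_week = 2) {
  stopifnot(per_week <= length(days))
  pool <- days
  out <- matrix(NA_integer_, nrow = n_weeks, ncol = per_week)
  for (w in seq_len(n_weeks)) {
    picked <- integer(0)                        # days already used this week
    for (k in seq_len(per_week)) {
      if (length(pool) == 0) pool <- days       # restore the depleted pool
      ok <- setdiff(pool, picked)               # never repeat a day within a week
      pick <- ok[sample.int(length(ok), 1)]     # safe even when length(ok) == 1
      picked <- c(picked, pick)
      pool <- pool[pool != pick]                # deplete the pool
    }
    out[w, ] <- picked
  }
  out
}

set.seed(1)                                     # only to make the draw repeatable
sched <- draw_schedule(27)
table(factor(sched, levels = 3:7))              # coverage of each DOW_NUM
```

Because every pick comes out of the current pool, each pass through the pool uses every weekday exactly once, so over 27 weeks (54 picks) the per-day counts can differ by at most one.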



> 
> Randomly sample 2 DOW_NUM without replacement from each WEEK ( () = no two
> identical DOW_NUM can be sampled
> in the same WEEK)
> 
> sample = {3,7}, {5,6}, {4,3}, {1,5}, --> for each WEEK in dataframe
> 

So, are you sampling from [3,4,5,6,7], or [1,2,4,5,6], or ...?  Can you show an 'example' of what you would like to end up given your data below?

> 
> Thank you,
> 
> Mike
> 
> 
>          DATE DOW DOW_NUM WEEK
> 2  2011-05-02 Mon       3    1
> 3  2011-05-03 Tue       4    1
> 4  2011-05-04 Wed       5    1
> 5  2011-05-05 Thu       6    1
> 6  2011-05-06 Fri       7    1
> 9  2011-05-09 Mon       3    2
> 10 2011-05-10 Tue       4    2
> 11 2011-05-11 Wed       5    2
> 12 2011-05-12 Thu       6    2
> 13 2011-05-13 Fri       7    2
> 16 2011-05-16 Mon       3    3
> 17 2011-05-17 Tue       4    3
> 18 2011-05-18 Wed       5    3
> 19 2011-05-19 Thu       6    3
> 20 2011-05-20 Fri       7    3
> 23 2011-05-23 Mon       3    4
> 24 2011-05-24 Tue       4    4
> 25 2011-05-25 Wed       5    4
> 26 2011-05-26 Thu       6    4
> 27 2011-05-27 Fri       7    4
> 30 2011-05-30 Mon       3    5
> 31 2011-05-31 Tue       4    5
> 32 2011-06-01 Wed       5    5
> 33 2011-06-02 Thu       6    5
> 34 2011-06-03 Fri       7    5
> 
> DF <-
> structure(list(DATE = structure(c(15096, 15097, 15098, 15099,
> 15100, 15103, 15104, 15105, 15106, 15107, 15110, 15111, 15112,
> 15113, 15114, 15117, 15118, 15119, 15120, 15121, 15124, 15125,
> 15126, 15127, 15128), class = "Date"), DOW = c("Mon", "Tue",
> "Wed", "Thu", "Fri", "Mon", "Tue", "Wed", "Thu", "Fri", "Mon",
> "Tue", "Wed", "Thu", "Fri", "Mon", "Tue", "Wed", "Thu", "Fri",
> "Mon", "Tue", "Wed", "Thu", "Fri"), DOW_NUM = c(3, 4, 5, 6, 7,
> 3, 4, 5, 6, 7, 3, 4, 5, 6, 7, 3, 4, 5, 6, 7, 3, 4, 5, 6, 7),
>     WEEK = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4,
>     4, 4, 4, 4, 5, 5, 5, 5, 5)), .Names = c("DATE", "DOW", "DOW_NUM",
> "WEEK"), row.names = c(2L, 3L, 4L, 5L, 6L, 9L, 10L, 11L, 12L,
> 13L, 16L, 17L, 18L, 19L, 20L, 23L, 24L, 25L, 26L, 27L, 30L, 31L,
> 32L, 33L, 34L), class = "data.frame")
> 
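
To make the requested "example of what you would like to end up with" concrete, here is a rough sketch of how a set of selected days could be mapped back onto the five example weeks. The schedule in sched is hypothetical: its first three pairs are the {3,7}, {5,6}, {4,3} from the original message, and the last two are made up so that every value stays within 3:7 and each weekday is used twice. The data frame is rebuilt compactly here so the snippet runs on its own; it is assumed to match the DATE, DOW, DOW_NUM and WEEK columns of the DF posted above.

```r
# Rebuild the five-week weekday frame (assumed equivalent to the posted DF).
dates <- seq(as.Date("2011-05-02"), as.Date("2011-06-03"), by = "day")
wday  <- as.POSIXlt(dates)$wday                 # 0 = Sunday, ..., 6 = Saturday
keep  <- wday >= 1 & wday <= 5                  # Monday through Friday only
DF <- data.frame(DATE    = dates[keep],
                 DOW     = c("Mon", "Tue", "Wed", "Thu", "Fri")[wday[keep]],
                 DOW_NUM = wday[keep] + 2,      # Mon = 3, ..., Fri = 7, as in the post
                 WEEK    = rep(1:5, each = 5))

# Hypothetical schedule: two selected DOW_NUM values per week.
sched <- data.frame(WEEK    = rep(1:5, each = 2),
                    DOW_NUM = c(3, 7,  5, 6,  4, 3,  7, 5,  6, 4))

# Keep only the rows of DF whose (WEEK, DOW_NUM) pair was selected.
survey <- merge(DF, sched, by = c("WEEK", "DOW_NUM"))
survey <- survey[order(survey$DATE), ]
print(survey)
```

The result has two survey dates per week, and in this particular schedule each weekday appears exactly twice across the five weeks.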

Dan

Daniel Nordlund
Bothell, WA USA


