[R] Complex sampling?

Jonathan P Daily jdaily at usgs.gov
Wed Mar 9 21:32:43 CET 2011


--------------------------------------
Jonathan P. Daily
Technician - USGS Leetown Science Center
11649 Leetown Road
Kearneysville WV, 25430
(304) 724-4480
"Is the room still a room when its empty? Does the room,
 the thing itself have purpose? Or do we, what's the word... imbue it."
     - Jubal Early, Firefly

r-help-bounces at r-project.org wrote on 03/09/2011 01:01:32 PM:

> From: Hosack, Michael
> To: r-help at R-project.org
> Date: 03/09/2011 01:04 PM
> Sent by: r-help-bounces at r-project.org
> Subject: [R] Complex sampling?
> 
> > -----Original Message-----
> > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> > On Behalf Of Hosack, Michael
> > Sent: Wednesday, March 09, 2011 7:34 AM
> > To: r-help at R-project.org
> > Subject: [R] Complex sampling?
> > 
> > R users,
> > 
> > I am trying to generate a randomized weekday survey schedule that ensures
> > even coverage of weekdays in the sample, where the distribution of variable
> > DOW is random with respect to WEEK. To accomplish this I need to randomly
> > sample without replacement two weekdays per week for each of 27 weeks (only
> > 5 are shown).
> 
> This seems simple enough, sampling without replacement.
> 
> > However, I need to sample from a sequence (3:7) that needs to be completely
> > depleted and replenished until the final selection is made. Here is an
> > example of what I want to do, beginning at WEEK 1. I would prefer to do
> > this without using a loop, if possible.
> > 
> > sample frame: [3,4,5,6,7] --> [4,5,6] --> [4],[1,2,3,(4),5,6] -->
> > [1,2,4,5,6] --> for each WEEK in dataframe
> 
> OK, now you have me completely lost.  Sorry, but I have no clue as 
> to what you just did here.  It looks like you are trying to describe 
> some transformation/algorithm but I don't follow it.
> 
> 
> 
> I could not reply to this email because it had not been delivered to my 
> inbox, so I had to copy it from the forum. 
> I apologize for the confusion; this would take less than a minute to
> explain in conversation but an hour 
> to explain well in print. Two DOW_NUMs will be selected randomly 
> without replacement from the vector 3:7 for each WEEK. When this 
> vector is reduced to a single integer, that integer will be selected

This is what doesn't make sense to me. You sample two values from 3:7 
without replacement. Then this sample of two is turned into one by a 
mechanism you have not specified. Are these averaged? Let's assume you 
assign this mechanism to fun(x1), where x1 is the sample.
 
> and the vector will be restored and a single integer will then be 
> selected that differs from the prior selected integer (i.e. cannot 
> sample the same day twice in the same week).

So then you want to sample twice from 3:7 again such that fun(x2) != 
fun(x1)?

If so, then this might be what you need.

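# fun() stands in for the unspecified mechanism that turns a pair into
# one value; mean() is only a placeholder assumption.
fun <- function(x) mean(x)

vals <- matrix(NA_real_, 5, 2)

# all ordered pairs of distinct values from 3:7, and fun() of each pair
pairs <- subset(expand.grid(a = 3:7, b = 3:7), a != b)
score <- apply(pairs, 1, fun)

for(i in 1:5)
{
        # first value: fun() of a randomly chosen pair
        vals[i,1] <- score[sample(nrow(pairs), 1)]
        # second value: fun() of a random pair whose result differs
        ok <- which(score != vals[i,1])
        vals[i,2] <- score[ok[sample(length(ok), 1)]]
}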

Or something similar.

> This process will be 
> repeated until two DOW_NUM have been assigned for each WEEK. That 
> process is what I attempted to illustrate in my original message. 
> This is beyond my current coding capabilities. 
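
Reading the description above literally, the scheme might be sketched as follows. This is only one interpretation: the pool 3:7 is carried over from week to week, it is refilled only once it is empty, and a day already picked in the current week is excluded from the draw that follows a refill. The name sample_schedule and its arguments are made up for illustration, and this version does loop over the weeks.

```r
# Hypothetical helper (not from the original post): draw `per_week` distinct
# days for each week from a pool that is used up completely before refilling.
sample_schedule <- function(n_weeks, days = 3:7, per_week = 2) {
  stopifnot(per_week <= length(days))
  out  <- matrix(NA_integer_, n_weeks, per_week)
  pool <- integer(0)                       # days still unused in this cycle
  for (w in seq_len(n_weeks)) {
    picked <- integer(0)
    while (length(picked) < per_week) {
      if (length(pool) == 0) pool <- days  # replenish an empty pool
      avail <- setdiff(pool, picked)       # never repeat a day within a week
      k     <- min(per_week - length(picked), length(avail))
      # index with sample.int() to avoid sample()'s length-one surprise
      take   <- avail[sample.int(length(avail), k)]
      picked <- c(picked, take)
      pool   <- setdiff(pool, take)
    }
    out[w, ] <- picked
  }
  out
}

set.seed(1)                 # fixed seed only so the run can be repeated
sched <- sample_schedule(5) # one row per WEEK, two DOW_NUM values per row
sched
```

Because every day leaves the pool exactly once per cycle, 5 weeks give each weekday exactly twice, and for 27 weeks the per-day counts can differ by at most one.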
> 
> 
> 
> > 
> > Randomly sample 2 DOW_NUM without replacement from each WEEK ( () = no two
> > identical DOW_NUM can be sampled in the same WEEK)
> > 
> > sample = {3,7}, {5,6}, {4,3}, {1,5}, --> for each WEEK in dataframe
> > 
> 
> So, are you sampling from [3,4,5,6,7], or [1,2,4,5,6], or ...?  Can 
> you show an 'example' of what you would like to end up with, given your 
> data below?
> 
> > 
> > Thank you,
> > 
> > Mike
> > 
> > 
> >          DATE DOW DOW_NUM WEEK
> > 2  2011-05-02 Mon       3    1
> > 3  2011-05-03 Tue       4    1
> > 4  2011-05-04 Wed       5    1
> > 5  2011-05-05 Thu       6    1
> > 6  2011-05-06 Fri       7    1
> > 9  2011-05-09 Mon       3    2
> > 10 2011-05-10 Tue       4    2
> > 11 2011-05-11 Wed       5    2
> > 12 2011-05-12 Thu       6    2
> > 13 2011-05-13 Fri       7    2
> > 16 2011-05-16 Mon       3    3
> > 17 2011-05-17 Tue       4    3
> > 18 2011-05-18 Wed       5    3
> > 19 2011-05-19 Thu       6    3
> > 20 2011-05-20 Fri       7    3
> > 23 2011-05-23 Mon       3    4
> > 24 2011-05-24 Tue       4    4
> > 25 2011-05-25 Wed       5    4
> > 26 2011-05-26 Thu       6    4
> > 27 2011-05-27 Fri       7    4
> > 30 2011-05-30 Mon       3    5
> > 31 2011-05-31 Tue       4    5
> > 32 2011-06-01 Wed       5    5
> > 33 2011-06-02 Thu       6    5
> > 34 2011-06-03 Fri       7    5
> > 
> > DF <-
> > structure(list(DATE = structure(c(15096, 15097, 15098, 15099,
> > 15100, 15103, 15104, 15105, 15106, 15107, 15110, 15111, 15112,
> > 15113, 15114, 15117, 15118, 15119, 15120, 15121, 15124, 15125,
> > 15126, 15127, 15128), class = "Date"), DOW = c("Mon", "Tue",
> > "Wed", "Thu", "Fri", "Mon", "Tue", "Wed", "Thu", "Fri", "Mon",
> > "Tue", "Wed", "Thu", "Fri", "Mon", "Tue", "Wed", "Thu", "Fri",
> > "Mon", "Tue", "Wed", "Thu", "Fri"), DOW_NUM = c(3, 4, 5, 6, 7,
> > 3, 4, 5, 6, 7, 3, 4, 5, 6, 7, 3, 4, 5, 6, 7, 3, 4, 5, 6, 7),
> >     WEEK = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4,
> >     4, 4, 4, 4, 5, 5, 5, 5, 5)), .Names = c("DATE", "DOW", "DOW_NUM",
> > "WEEK"), row.names = c(2L, 3L, 4L, 5L, 6L, 9L, 10L, 11L, 12L,
> > 13L, 16L, 17L, 18L, 19L, 20L, 23L, 24L, 25L, 26L, 27L, 30L, 31L,
> > 32L, 33L, 34L), class = "data.frame")
> > 
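
Once two DOW_NUM values per WEEK have been chosen by whatever sampler, the matching rows can be pulled out of a frame like DF with merge(). The sketch below rebuilds a frame of the same shape so it runs on its own; the objects sched, keys and survey are made-up names, and the plain per-week sample() call is only a stand-in that does not enforce even coverage.

```r
# Rebuild a frame shaped like DF above: weekdays only, DOW_NUM 3..7 = Mon..Fri
DF <- data.frame(
  DATE    = as.Date("2011-05-02") + rep(0:4, times = 5) + rep(7 * (0:4), each = 5),
  DOW_NUM = rep(3:7, times = 5),
  WEEK    = rep(1:5, each = 5)
)

set.seed(1)  # fixed seed only so the run can be repeated
# Stand-in sampler (assumption): two distinct days per week, weeks independent
sched <- t(sapply(1:5, function(w) sample(3:7, 2)))

# Long form: one (WEEK, DOW_NUM) key per selected day.
# as.vector() reads the matrix column by column, so WEEK repeats 1..5 twice.
keys <- data.frame(WEEK    = rep(seq_len(nrow(sched)), times = ncol(sched)),
                   DOW_NUM = as.vector(sched))

# Keep only the scheduled rows, in date order
survey <- merge(DF, keys, by = c("WEEK", "DOW_NUM"))
survey <- survey[order(survey$DATE), ]
survey
```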
> 
> Dan
> 
> Daniel Nordlund
> Bothell, WA USA
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


