[R] A question about sampling
Greg Snow
Greg.Snow at imail.org
Wed Feb 2 23:38:34 CET 2011
The apply functions are really just hidden loops, and loops have been made efficient enough that they are usually not much slower (and sometimes a bit faster) than the apply functions.
If you really want to use apply, then look at mapply (you might need to convert the rows of the matrix to a list), or you could use sapply on the vector of row indices, 1:50000, and write a function that indexes into the matrix and the vector. But if you understand the loop, then I would suggest using the loop.
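A rough sketch of the three options, on simulated stand-in data rather than your actual matrix. The helper count.row and the result names res.loop, res.sapply and res.mapply are made up for illustration, and I use tabulate in place of table(factor(...)) to get the counts:

```r
n <- 500; k <- 17                 # small stand-ins for 50000 x 17

# Simulated inputs (assumption): each row of 'things' is a weight vector,
# 'size.things' holds the sample size for each row
set.seed(42)
things <- matrix(runif(n * k), nrow = n, ncol = k)
size.things <- sample(20:60, n, replace = TRUE)

# Count how often each category is drawn for one weight vector
count.row <- function(size, prob) {
  draws <- sample(length(prob), size, replace = TRUE, prob = prob)
  tabulate(draws, nbins = length(prob))
}

# 1. Plain loop, filling a preallocated result matrix
set.seed(1)
res.loop <- matrix(0L, nrow = n, ncol = k)
for (i in seq_len(n)) {
  res.loop[i, ] <- count.row(size.things[i], things[i, ])
}

# 2. sapply over the row indices; the result comes back k x n, so transpose
set.seed(1)
res.sapply <- t(sapply(seq_len(n),
                       function(i) count.row(size.things[i], things[i, ])))

# 3. mapply, after converting the matrix rows to a list
set.seed(1)
rows <- lapply(seq_len(n), function(i) things[i, ])
res.mapply <- t(mapply(count.row, size.things, rows))

# Each row of counts should add up to the requested sample size
stopifnot(all(rowSums(res.loop) == size.things))
```

All three give an n x 17 matrix of counts. Wrapping each version in system.time() on the full-size data would show whether the apply versions buy anything over the loop.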
--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Patrick Boily
> Sent: Wednesday, February 02, 2011 1:03 PM
> To: 'r-help at r-project.org'
> Subject: [R] A question about sampling
>
> Greetings,
>
> I am attempting to do something with R that I think should be
> efficiently do-able, but I haven't yet found success.
>
> I have a vector of probability weights (for 17 categories), let's call
> it things (it could look like the one below, for instance).
>
> > things
> 0.026 0 0.233 0 0.131 0 0.415 0 0 0 0 0 0.192 0 0.067 0 0
>
> I'd like a sample of size size.things (say, 47) of the 17 categories
> (with replacement). And I'd like to produce a vector of length 17 which
> enumerates the number of times each category has been selected. This is
> fairly straightforward to do; for instance:
>
> > things2 <- table(factor(sample(1:17, size.things[1], replace=TRUE, prob=things), levels=1:17))
> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
> 1 0 9 0 4 0 18 0 0 0 0 0 5 0 4 0 0
>
> What would I need to do if I had a matrix things (50000 x 17) of
> probability weight vectors and a vector of sample sizes size.things (of
> length 50000), and I wanted to simultaneously sample size.things[1] of
> the 17 categories with probability weight vector things[1,],
> size.things[2] of the 17 categories with probability weight vector
> things[2,], etc.? A loop will do the trick, but it takes a while and it
> seems to me that I could more efficiently use tapply somehow. Or
> something that behaves like rowSums. I'm not familiar enough with R to
> see an easy way out. Perhaps there isn't? Does anybody have an idea?
>
> Regards,
>
> Patrick