[R] vectorized approach to cumulative sampling
Rich FitzJohn
rich.fitzjohn at gmail.com
Thu Apr 7 23:47:44 CEST 2005
Hi,
sample() takes a "replace" argument, so you can draw one large sample
with replacement and then trim it, like this. (In the sample() call,
the size 50*target/mean(old) asks for roughly 50 times as many values
as are likely to be needed, so the while loop will probably run only
once. The factor is easy to tune, and there may be better ways of
guessing how many values to take.)
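old <- 1:2000
p <- runif(length(old))   # sampling weights, one per element of old
target <- 4000
new <- 0
# Draw a generous batch; redraw in the unlikely case it falls short
while ( sum(new) < target )
  new <- sample(old, ceiling(50*target/mean(old)), TRUE, p)
# First position at which the running total reaches the target
i <- which(cumsum(new) >= target)[1]
new <- new[1:i]
# Shorten the last sample so that sum(new) equals target exactly
new[i] <- new[i] - (sum(new)-target)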
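In case it is useful, the steps above could be wrapped in a function
along these lines. This is only a sketch: the name sample_to_target()
and the "oversample" argument are made up for illustration, and it
assumes that "old" has more than one element and that all its values
are positive. It differs from the code above in two ways: the batch
size is guessed from the mean of "old" weighted by p rather than the
plain mean, and a short batch is extended rather than redrawn.

```r
# Hypothetical helper, not part of base R
sample_to_target <- function(old, p, target, oversample = 2) {
  # Expected value of a single draw under the sampling weights p
  ev <- sum(old * p) / sum(p)
  # Guess how many draws are needed, padded by the oversample factor
  n <- ceiling(oversample * target / ev)
  new <- sample(old, n, replace = TRUE, prob = p)
  # Append further batches until the running total reaches the target
  while (sum(new) < target)
    new <- c(new, sample(old, n, replace = TRUE, prob = p))
  # First position at which the cumulative sum reaches the target
  i <- which(cumsum(new) >= target)[1]
  new <- new[1:i]
  # Shorten the last draw so that the total equals the target exactly
  new[i] <- new[i] - (sum(new) - target)
  new
}

set.seed(1)                 # only to make the example reproducible
old <- 1:2000
p <- runif(length(old))
res <- sample_to_target(old, p, 4000)
sum(res) == 4000            # the trimmed sample sums to the target
```

A smaller "oversample" draws fewer unused values but makes it more
likely that the while loop has to run.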
Cheers,
Rich
On Apr 8, 2005 9:19 AM, Daniel E. Bunker <deb37 at columbia.edu> wrote:
> Hi All,
>
> I need to sample a vector ("old"), with replacement, up to the point
> where my vector of samples ("new") sums to a predefined value
> ("target"), shortening the last sample if necessary so that the total
> sum ("newsum") of the samples matches the predefined value.
>
> While I can easily do this with a "while" loop (see below for example
> code), because the length of both "old" and "new" may be > 20,000, a
> vectorized approach will save me lots of CPU time.
>
> Any suggestions would be greatly appreciated.
>
> Thanks, Dan
>
> # loop approach
> old=c(1:10)
> p=runif(1:10)
> target=20
>
> newsum=0
> new=NULL
> while (newsum<target) {
> i=sample(old, size=1, prob=p);
> new[length(new)+1]=i;
> newsum=sum(new)
> }
> new
> newsum
> target
> if(newsum>target){new[length(new)]=target-sum(new[-length(new)])}
> new
> newsum=sum(new); newsum
> target
>
--
Rich FitzJohn
rich.fitzjohn <at> gmail.com | http://homepages.paradise.net.nz/richa183
You are in a maze of twisty little functions, all alike