[R] Sampling with conditions

SarahJoyes sjoyes at uoguelph.ca
Tue Nov 8 21:24:33 CET 2011


Dan 
Nordlund, Dan (DSHS/RDA) wrote:
> 
>> -----Original Message-----
>> From: r-help-bounces@ [mailto:r-help-bounces at r-
>> project.org] On Behalf Of SarahJoyes
>> Sent: Tuesday, November 08, 2011 5:57 AM
>> To: r-help@
>> Subject: Re: [R] Sampling with conditions
>> 
>> That is exactly what I want, and it's so simple!
>> Thanks so much!
>> 
> 
> Sarah,
> 
> I want to point out that my post was qualified by "something like".  I am
> not sure it is exactly what you want.  Since you didn't quote my post, let
> me show my suggestion and then express my concern.
> 
> n <- matrix(0,nrow=5, ncol=10)
> repeat{
>   c1 <- sample(0:10, 4, replace=TRUE)
>   if(sum(c1) <= 10) break
> }
> n[,1] <- c(c1,10-sum(c1))
> n
> 
> This nominally meets your criteria, but it will tend to result in larger
> digits being under-represented.  For example, you are unlikely to get a result
> like c(0,8,0,0,2) or c(9,0,0,1,0).
> 
> That may be OK for your purposes, but I wanted to point it out.
> 
> You could use "something like" 
> 
> n <- matrix(0,nrow=5, ncol=10)
> c1 <- rep(0,4)
> for(i in 1:4){
>   upper <- 10-sum(c1)
>   c1[i] <- sample(0:upper, 1, replace=TRUE)
>   if(sum(c1) == 10) break
> }
> n[,1] <- c(c1,10-sum(c1))
> n
> 
> if that would suit your purposes better.
> 
> 
> Good luck,
> 
> Dan
> 
> Daniel J. Nordlund
> Washington State Department of Social and Health Services
> Planning, Performance, and Accountability
> Research and Data Analysis Division
> Olympia, WA 98504-5204
> 
> 
> 


Perhaps a little bit of context would be helpful.
I am trying to figure out the ideal age structure for a population of ten
individuals that would yield the best overall survival rate, given that each
age group has a different survivorship and a different reproductive rate.
So yes, a bias towards smaller numbers would be a problem. The only other
problem that I see with your revised code is that it will be biased
towards higher numbers in the first age group, i.e. the first row of the
column...
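
One rough way to check that suspicion is to repeat the sequential draw many times and compare the average count in each row. This is only a sketch: draw_sequential, its arguments, and the seed are names and values made up here for illustration, and sample.int() is swapped in for sample() to draw from 0:upper.

```r
# Hypothetical helper (not from the thread): one draw using the
# sequential scheme, where each age group is sampled from whatever is left.
draw_sequential <- function(total = 10, groups = 5) {
  c1 <- rep(0, groups - 1)
  for (i in seq_len(groups - 1)) {
    upper <- total - sum(c1)
    # sample.int(upper + 1, 1) - 1 draws uniformly from 0:upper
    c1[i] <- sample.int(upper + 1, 1) - 1
    if (sum(c1) == total) break
  }
  # the last age group takes the remainder so the column sums to total
  c(c1, total - sum(c1))
}

set.seed(1)  # arbitrary seed, only so the run is reproducible
draws <- replicate(10000, draw_sequential())  # 5 x 10000 matrix
rowMeans(draws)  # average count per age group (row)
```

Since the first row is uniform on 0:10 its average should come out near 5, and each later sampled row should average roughly half of the one before it, which is the bias I am worried about.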
The other idea I was playing with was to create a series of ifelse
statements for each row of the column...
Something like:
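n<-matrix(0,nrow=5,ncol=10)
n[1,1]<-sample(0:10,1)
# row 2 gets 0 if row 1 already holds all ten individuals
n[2,1]<-ifelse(n[1,1]==10,0,sample(0:10,1))
# row 3 looks at the total of the rows filled so far
n[3,1]<-ifelse(sum(n[1:2,1])>10,0,sample(0:10,1))
etc...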
I still think that might be biased towards high numbers in the first rows
though...
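
One possible way around the row-order bias, assuming what is wanted is for every split of the ten individuals over the five age groups to be equally likely, is a "stars and bars" draw: place 4 dividers at random among 14 slots and read the counts off the gaps. This is just a sketch under that assumption; draw_uniform and its arguments are made-up names.

```r
# Hypothetical helper: draw one split of `total` individuals into `groups`
# nonnegative counts, with every possible split equally likely.
draw_uniform <- function(total = 10, groups = 5) {
  # pick the positions of the groups - 1 dividers among total + groups - 1 slots
  bars <- sort(sample.int(total + groups - 1, groups - 1))
  # the number of free slots between consecutive dividers is the count per group
  diff(c(0, bars, total + groups)) - 1
}

n <- matrix(0, nrow = 5, ncol = 10)
n[, 1] <- draw_uniform()
n
```

No row is treated differently from any other here, so a column like c(0,8,0,0,2) is as likely as c(2,2,2,2,2). Whether "equally likely splits" is the right notion of unbiased for the age-structure question is a separate judgment call.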
hmmm
SJ



--
View this message in context: http://r.789695.n4.nabble.com/Sampling-with-conditions-tp4014036p4017351.html
Sent from the R help mailing list archive at Nabble.com.


