[R] dataframe, simulating data

Jim Lemon jim at bitwrit.com.au
Fri Dec 31 12:07:56 CET 2010


On 12/31/2010 08:51 PM, Sarah wrote:
>
> Dear all,
>
> I'm having trouble with my dataframe, and I hope someone can help me out...
> I have data from 40 subjects in a data frame. I have randomly assigned each
> subject to group 0 or group 1 (mar.y==0 or mar.y==1), using assignment
> probabilities.
> In the end, I want 34 cases assigned to group 0, with the rest of the
> subjects assigned to group 1. However, if there are more than 34 cases
> assigned to group 0 due to the randomness, I would like to keep 34 cases in
> group 0 (this is already written in my script below), but with the rest of
> the cases assigned to group 1. (Vice versa, if there are fewer than 34 cases
> assigned to group 0, I would like to sample cases from group 1 and put them
> in group 0, while retaining the rest of group 1 in my dataframe.)
> I can't figure out how to keep 34 cases in group 0, WHILE assigning the rest
> of the cases a value 1 (mar.y==1)...
>
> if (length(which(df$mar.y==0))>34) {
> df<- df[sample(which(df$mar.y==0),34), ]
>   } else {
>   df<- df[c(which(df$mar.y==0),
> sample(which(df$mar.y==1),34-length(which(df$mar.y==0)))), ]
> }
>
> (I'm aware that using this script is not the most elegant way to solve the
> problem, but because this script is part of a larger design, I have to stick
> to this example.)

Hi Sarah,
Why not just use sample to select 34 of your cases from the 40?

df$mar.y<-1
df$mar.y[sample(1:40,34)]<-0
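
If the original probability-based assignment has to be kept wherever possible, something along these lines might work instead. It is only a sketch: the example data frame, the 0.15 probability, and the names target, zeros, ones, and move are made up for illustration and are not from your script. It keeps all 40 rows and only flips as many labels as needed to reach 34 cases in group 0.

```r
# Hypothetical data: 40 subjects with a probabilistic 0/1 assignment.
set.seed(1)  # only so the sketch is reproducible
df <- data.frame(id = 1:40,
                 mar.y = rbinom(40, size = 1, prob = 0.15))

target <- 34                     # desired number of cases in group 0
zeros <- which(df$mar.y == 0)    # row indices currently in group 0
ones  <- which(df$mar.y == 1)    # row indices currently in group 1

if (length(zeros) > target) {
  # Too many in group 0: move a random surplus to group 1.
  surplus <- length(zeros) - target
  move <- zeros[sample.int(length(zeros), surplus)]
  df$mar.y[move] <- 1
} else if (length(zeros) < target) {
  # Too few in group 0: move a random shortfall over from group 1.
  shortfall <- target - length(zeros)
  move <- ones[sample.int(length(ones), shortfall)]
  df$mar.y[move] <- 0
}

table(df$mar.y)  # check the resulting group sizes
```

Indexing with sample.int() rather than calling sample() on the index vector directly avoids the surprise that sample(x, n) treats a length-one x as 1:x.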

Jim


