[R] dataframe, simulating data

Petr Savicky savicky at cs.cas.cz
Fri Dec 31 14:56:35 CET 2010


On Fri, Dec 31, 2010 at 05:05:08AM -0800, Sarah wrote:
> 
> I'm sorry, I don't think I've made myself clear enough.
> 
> Cases have been randomly assigned to one of the two groups, with certain
> probabilities (based on other variables).
> So, if there are too many people (i.e., more than 34) assigned to group 0, I
> would like to sample 34 cases from group 0, and give the rest of the cases a
> value 1. My dataframe would contain 40 cases; 34 with mar.y==0 and the rest
> given (or some already had) a value mar.y==1.
> If, however, too few cases have been assigned to group 0, I need to randomly
> select cases from group 1 and put them in group 0 (i.e., give them a value
> 0). My dataframe would contain the previous selected cases (mar.y==0), PLUS
> cases from group 1 who are now assigned to group 0 (mar.y==0), PLUS the
> remaining cases who stayed in group 1 (mar.y==1). 
> (In other words, how can I change the value for df$mar.y from 1 to 0 or vice
> versa for some cases)?

For example

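  old <- df$mar.y  # keep the original assignment, just to check the change

  ind0 <- which(df$mar.y == 0)  # row indices currently in group 0
  ind1 <- which(df$mar.y == 1)  # row indices currently in group 1
  n0 <- length(ind0)
  # Index through sample.int(): sample(x, k) treats a length-1 x as 1:x
  if (n0 > 34) {
      # too many in group 0: move a random n0 - 34 of them to group 1
      df$mar.y[ind0[sample.int(n0, n0 - 34)]] <- 1
  } else {
      # too few (or exactly 34): move a random 34 - n0 from group 1 to group 0
      df$mar.y[ind1[sample.int(length(ind1), 34 - n0)]] <- 0
  }

  table(old, new = df$mar.y)  # cross-tabulate old vs. new, just to check the change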
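
If you want to try this without your own data, here is a minimal self-contained sketch. The toy data frame (40 cases, each put in group 0 with probability 0.7) and the helper name fix_group0 are only assumptions for illustration, not anything from your setup. It wraps the same reassignment in a function and checks that group 0 ends up with exactly 34 cases.

```r
# Hypothetical toy data: 40 cases, mar.y == 0 with probability 0.7
set.seed(1)
df <- data.frame(id = 1:40, mar.y = rbinom(40, 1, 0.3))

# Hypothetical helper: force exactly `target` zeros in a 0/1 vector y
fix_group0 <- function(y, target = 34) {
  ind0 <- which(y == 0)
  ind1 <- which(y == 1)
  excess <- length(ind0) - target
  if (excess > 0) {
    # too many zeros: flip a random `excess` of them to 1
    y[ind0[sample.int(length(ind0), excess)]] <- 1
  } else if (excess < 0) {
    # too few zeros: flip a random `-excess` ones to 0
    y[ind1[sample.int(length(ind1), -excess)]] <- 0
  }
  y
}

old <- df$mar.y
df$mar.y <- fix_group0(df$mar.y)

table(old, new = df$mar.y)  # off-diagonal cells are the reassigned cases
sum(df$mar.y == 0)          # number of cases now in group 0
```

Only one off-diagonal cell of the table can be nonzero, since cases are moved in one direction only.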

Does this work in your situation?

Petr Savicky.


