[R] simple randomization question: How to perform "sample" in chunks

Greg Snow Greg.Snow at imail.org
Thu Aug 20 19:04:08 CEST 2009


Here is a one-liner:

(yy <- do.call( rbind, sample( split(xx, xx$a) ) ))

Reading from the inside out: split() breaks the data frame into a list of smaller data frames, one per value of a (so the order of b within each piece is left intact); sample() then shuffles the order of the elements of that list; and do.call(rbind, ...) stacks the pieces back together into a single data frame.
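To make the three steps easier to follow, here is the same one-liner unpacked into separate statements, using the xx data frame from the original question. This is only a sketch: the intermediate names (pieces, shuffled) and the seed value are made up for illustration, and resetting the row names is optional.

```r
# Example data from the original question
xx <- data.frame(a = c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4),
                 b = c(1, 1, 2, 1, 2, 3, 1, 2, 3, 4))

set.seed(42)  # arbitrary seed, only so the shuffle is reproducible

pieces   <- split(xx, xx$a)           # list of data frames, one per value of a
shuffled <- sample(pieces)            # same pieces, in random order
yy       <- do.call(rbind, shuffled)  # stack the pieces back into one data frame
rownames(yy) <- NULL                  # optional: drop the row names built by rbind

# Sanity checks: same number of rows, each a-block is contiguous,
# and b is still in its original (increasing) order within each block
stopifnot(nrow(yy) == nrow(xx))
stopifnot(length(rle(yy$a)$lengths) == length(unique(xx$a)))
stopifnot(all(tapply(yy$b, yy$a, function(v) !is.unsorted(v))))
```

Because sample() is applied to a list here, it permutes the list elements themselves, so this also behaves sensibly when a has only one distinct value.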

Does this do what you want?

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Tal Galili
> Sent: Thursday, August 20, 2009 9:22 AM
> To: r-help at r-project.org
> Subject: [R] simple randomization question: How to perform "sample" in chunks
> 
> Hello dear R-help group.
> 
> My task looks simple, but I can't seem to find a "smart" (i.e., non-loop)
> solution to it.
> 
> Task: I wish to randomize a data.frame by one column, while keeping the
> inner-order in the second column as is.
> 
> So for example, let's say I have the following data.frame:
> 
> xx <- data.frame(a = c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4),
>                  b = c(1, 1, 2, 1, 2, 3, 1, 2, 3, 4))
> 
> I would like to shuffle it by column "a", while keeping the order in
> column "b".
> 
> Here is my "not-smart" way of doing it:
> 
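> # R example
> xx <- data.frame(a = c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4),
>                  b = c(1, 1, 2, 1, 2, 3, 1, 2, 3, 4))
> 
> # Shuffle the blocks of rows defined by column a,
> # keeping the original row order within each block
> randomize.by.column.a <- function(xx)
> {
>   a.values <- unique(xx$a)
>   # index with sample.int() so a single numeric value is not expanded to 1:n
>   new.a.order <- a.values[sample.int(length(a.values))]
>   new.xx <- NULL
>   for (i in new.a.order)
>   {
>     xx.subset <- xx[xx$a %in% i, ]      # rows of the current block
>     new.xx <- rbind(new.xx, xx.subset)  # append the block to the result
>   }
>   return(new.xx)
> }
> randomize.by.column.a(xx)
> # END of - R example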
> 
> 
> 
> I would love a better, faster way of doing it.
> 
> Thanks,
> Tal
> 
> --
> ----------------------------------------------
> 
> 
> My contact information:
> Tal Galili
> Phone number: 972-50-3373767
> FaceBook: Tal Galili
> My Blogs:
> http://www.r-statistics.com/
> http://www.talgalili.com
> http://www.biostatistics.co.il
> 
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



