[R] simple randomization question: How to perform "sample" in chunks
Greg Snow
Greg.Snow at imail.org
Thu Aug 20 19:04:08 CEST 2009
Here is a one-liner:
(yy <- do.call( rbind, sample( split(xx, xx$a) ) ))
Reading from the inside out: split() breaks the data frame into a list of data frames, one per value of a (the order of b within each piece is left intact); sample() then shuffles that list; and do.call(rbind, ...) stacks the pieces back together into a single data frame.
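In case it helps to see the intermediate objects, here is the same thing unrolled into three steps. This is only a sketch: the names pieces and shuffled are made up for illustration, and the set.seed() call is just there so the run is repeatable.

```r
set.seed(42)  # assumption: fixed seed only so the sketch is repeatable

xx <- data.frame(a = c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4),
                 b = c(1, 1, 2, 1, 2, 3, 1, 2, 3, 4))

# Step 1: a list of data frames, one per distinct value of a.
pieces <- split(xx, xx$a)

# Step 2: sample() on a list permutes its elements, i.e. whole chunks.
shuffled <- sample(pieces)

# Step 3: stack the chunks back into a single data frame.
yy <- do.call(rbind, shuffled)

# Sanity checks: same number of rows, each value of a still forms one
# contiguous run, and b is still in increasing order within each chunk.
stopifnot(nrow(yy) == nrow(xx))
stopifnot(length(rle(yy$a)$values) == length(unique(xx$a)))
stopifnot(!any(sapply(split(yy$b, yy$a), is.unsorted)))
```

Since only the list of chunks is permuted, nothing inside a chunk is ever reordered.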
Does this do what you want?
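One side note: rbind on a named list builds new row names from the list names. If the original row names matter, an index-based variant might be worth a try. Again just a sketch under my own assumptions, with new.order, idx and zz as made-up names; it relies on order() leaving ties in their original order.

```r
xx <- data.frame(a = c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4),
                 b = c(1, 1, 2, 1, 2, 3, 1, 2, 3, 4))

# Random order of the distinct values of a.
new.order <- sample(unique(xx$a))

# Rank each row by where its a value falls in new.order; rows that tie
# (same a) keep their original relative order, so b stays as it was.
idx <- order(match(xx$a, new.order))

# Reorder by row index, which keeps the original row names.
zz <- xx[idx, ]

stopifnot(!any(sapply(split(zz$b, zz$a), is.unsorted)))
```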
--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Tal Galili
> Sent: Thursday, August 20, 2009 9:22 AM
> To: r-help at r-project.org
> Subject: [R] simple randomization question: How to perform "sample" in chunks
>
> Hello dear R-help group.
>
> My task looks simple, but I can't seem to find a "smart" (i.e., non-loop)
> solution to it.
>
> Task: I wish to randomize a data.frame by one column, while keeping the
> inner-order in the second column as is.
>
> So for example, let's say I have the following data.frame:
>
> xx <- data.frame(a = c(1,2,2,3,3,3,4,4,4,4),
>                  b = c(1,1,2,1,2,3,1,2,3,4))
>
> I would like to shuffle it by column "a", while keeping the order in
> column "b".
>
> Here is my "not-smart" way of doing it:
>
> # R example
> xx <- data.frame(a = c(1,2,2,3,3,3,4,4,4,4),
>                  b = c(1,1,2,1,2,3,1,2,3,4))
>
> randomize.by.column.a <- function(xx)
> {
>   new.a.order <- sample(unique(xx$a))
>   new.xx <- NULL
>   for(i in new.a.order)
>   {
>     xx.subset <- xx[ xx$a %in% i ,]
>     new.xx <- rbind(new.xx , xx.subset)
>   }
>
>   return(new.xx)
> }
> randomize.by.column.a(xx)
> # END of - R example
>
> I would love a better, faster way of doing it.
>
> Thanks,
> Tal
>
> --
> ----------------------------------------------
>
>
> My contact information:
> Tal Galili
> Phone number: 972-50-3373767
> FaceBook: Tal Galili
> My Blogs:
> http://www.r-statistics.com/
> http://www.talgalili.com
> http://www.biostatistics.co.il