[R] split-apply question

William Dunlap wdunlap at tibco.com
Fri Oct 2 19:51:20 CEST 2009


> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of hadley wickham
> Sent: Friday, October 02, 2009 6:07 AM
> To: jim holtman
> Cc: r-help at r-project.org; Kavitha Venkatesan
> Subject: Re: [R] split-apply question
> 
> On Fri, Oct 2, 2009 at 4:24 AM, jim holtman 
> <jholtman at gmail.com> wrote:
> > try this:
> >
> >> x <- read.table(textConnection("x1  x2  x3
> > + A   1    1.5
> > + B   2    0.9
> > + B   3    2.7
> > + C   7    1.8
> > + D   7    1.3"), header=TRUE)
> >> closeAllConnections()
> >> do.call(rbind, lapply(split(seq(nrow(x)), x$x1), function(.row){
> > +     x[.row[which.min(x$x2[.row])],]
> > + }))
> >  x1 x2  x3
> > A  A  1 1.5
> > B  B  2 0.9
> > C  C  7 1.8
> > D  D  7 1.3
> >>
> 
> Or, using plyr and subset
> 
> library(plyr)
> ddply(x, "x1", subset, x2 == min(x2))
> 
> Hadley

Since we are using min(), we can use sorting tricks:

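f3 <- function(x) {
   # sort by group, then by x2, so each group's minimum x2 comes first
   x <- x[with(x, order(x1, x2)), ]
   # TRUE wherever a value differs from its predecessor (start of a run)
   isFirstInRun <- function(z) c(TRUE, z[-1] != z[-length(z)])
   # keep only the first row of each run of x1
   x[isFirstInRun(x$x1), ]
}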

This has the advantage that it keeps the original row names intact.
It is quick even when there are lots of unique values in x1.
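
For example, on the sample data from earlier in the thread. This is only
a sketch: read.table(text=...) is a substitution for the textConnection()
call above and assumes a more recent version of R, and the names res and
the repeated definition of f3 are there just to make the snippet
self-contained.

```r
# Sample data from earlier in the thread
# (text= is an assumption; the original used textConnection())
x <- read.table(text = "x1 x2 x3
A 1 1.5
B 2 0.9
B 3 2.7
C 7 1.8
D 7 1.3", header = TRUE)

# f3 repeated here so the snippet runs on its own
f3 <- function(x) {
   x <- x[with(x, order(x1, x2)), ]
   isFirstInRun <- function(z) c(TRUE, z[-1] != z[-length(z)])
   x[isFirstInRun(x$x1), ]
}

res <- f3(x)
print(res)            # one row per level of x1, the one with the smallest x2
print(rownames(res))  # row names are those of the input rows that were kept
```

Row 3 (the second B row) is dropped, and the remaining rows keep the row
names they had in x. When a group has ties for the minimum x2, this keeps
only one of the tied rows.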

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  
> -- 
> http://had.co.nz/
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 