[R] Extracting data from dataframe with tied rows

William Dunlap wdunlap at tibco.com
Fri Aug 24 18:54:13 CEST 2012


Or use ave() to compute the within-group ranks (reversed, so max has rank 1) and select
the elements whose ranks are 1:
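f2 <- function (DATA) 
{
    stopifnot(is.data.frame(DATA), all(c("distance", "id", "month") %in% 
        names(DATA)))
    # rank distances within each id/month group, largest first;
    # ties.method = "first" gives exactly one rank-1 row per group
    revRanks <- ave(DATA[["distance"]], DATA[["id"]], DATA[["month"]], 
        FUN = function(x) rank(-x, ties.method = "first"))
    DATA[revRanks == 1, ]
}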
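
To see how tied maxima are handled, here is a small sketch. The data frame
'toy' and the names 'res', 'agg' and 'both' are made up for illustration;
f2 is repeated only so the block runs on its own. It compares f2 with the
aggregate()/merge() approach from the quoted message below when one
id/month group has two rows tied at the maximum.

```r
# f2 as posted above, repeated so this block is self-contained
f2 <- function(DATA) {
    stopifnot(is.data.frame(DATA),
              all(c("distance", "id", "month") %in% names(DATA)))
    revRanks <- ave(DATA[["distance"]], DATA[["id"]], DATA[["month"]],
                    FUN = function(x) rank(-x, ties.method = "first"))
    DATA[revRanks == 1, ]
}

# Hypothetical toy data: id "A" in month 1 has two rows tied at distance 12
toy <- data.frame(
    id       = c("A", "A", "A", "B", "B"),
    month    = c(1, 1, 1, 1, 2),
    distance = c(10, 12, 12, 7, 9),
    bearing  = c(90, 180, 270, 45, 300)
)

# ave()/rank() approach: one row per id/month, first of any tied rows
res <- f2(toy)
print(res)

# aggregate()/merge() approach: every row matching the group maximum
agg <- aggregate(toy["distance"], toy[c("id", "month")], max)
both <- merge(agg, toy)
print(both)
```

With this toy data f2 should return three rows (one per id/month), keeping
the first of the two tied rows for A in month 1, while merge() returns four
because it keeps both tied rows. Which behaviour you want depends on
whether ties should be reported or broken.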

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of Peter Alspach
> Sent: Thursday, August 23, 2012 4:37 PM
> To: rjb; r-help at r-project.org
> Subject: Re: [R] Extracting data from dataframe with tied rows
> 
> Tena koe John
> 
> One way:
> 
> johnData <- data.frame(id=rep(LETTERS[1:5],20), distance=rnorm(100, mean = 100),
> bearing=sample(1:360,100,replace=TRUE), month=sample(1:12,100,replace=TRUE))
> johnAgg <- aggregate(johnData[,'distance'], johnData[,c('id','month')], max)
> names(johnAgg)[3] <- 'distance'
> merge(johnAgg, johnData)
> 
> HTH ....
> 
> Peter Alspach
> 
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of rjb
> Sent: Friday, 24 August 2012 9:19 a.m.
> To: r-help at r-project.org
> Subject: [R] Extracting data from dataframe with tied rows
> 
> Hi R help,
> 
> I'm a fairly experienced R user but this manipulation has me stumped, please
> help:
> 
> DATA
> id<-rep(LETTERS[1:5],20)
> distance<-rnorm(100, mean = 100)
> bearing<-sample(1:360,100,replace=TRUE)
> month<-sample(1:12,100,replace=TRUE)
> 
> I have a dataset with records of individuals (id) , each with a distance
> (distance) & direction (bearing) recorded for each month (month).
> I want to find the largest distance per individual per month, which is easy
> with tapply or melt/cast (reshape),
> head(DATA_m<-melt(DATA,id=c("id","month")))
> cast(DATA_m,id+month~.,max)
> OR
> na.omit(melt(tapply(distance,list(id,month),max)))
> 
> *BUT THE CATCH IS*,
> I also want the *corresponding* bearing for that maximum distance per
> month. I've tried the steps above plus using which.max() and loops, but
> can't solve the problem. The real dataset is about 6000 rows.
> 
> I'm guessing the answer is in finding the row number from the original DATA
> but I can't figure how to do that with tapply or melt.
> 
> Any suggestions would be greatly appreciated.
> 
> John Burnside
> 
> 
> 
> 
> 
> --
> View this message in context: http://r.789695.n4.nabble.com/Extracting-data-from-
> dataframe-with-tied-rows-tp4641140.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 



