[R-sig-eco] Subset dataframe

Patrick Kilduff dpkilduff at yahoo.com
Tue Apr 16 18:59:13 CEST 2013


Hi Jade,

One way to do this is with the 'sqldf' package.

LRVS_cpue <- rnorm(100)*10
outing_ID <- rep(c("51801", "51802", "51803", "51804", "51805"), each = 20)

cpueData <- data.frame(outing_ID = outing_ID, LRVS_cpue = LRVS_cpue)

library(sqldf)
# This returns only the max value for each outing_ID and drops all the other
# rows. It sounds like you may want the rows without the max values.
# Also, you'll want to check how missing values might influence your query.
sqldf("SELECT outing_ID, max(LRVS_cpue) FROM cpueData GROUP BY outing_ID
       ORDER BY outing_ID;")
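
If you need the complete rows rather than just the maximum per outing_ID, here is a minimal base R sketch of one possible approach. It is not part of the sqldf query above; it rebuilds the same kind of toy data (with a seed added so it is reproducible), and the names groupMax, maxRows, sortedData and oneRowEach are made up for illustration. It assumes LRVS_cpue has no missing values, so please check that against your real data.

```r
# Toy data in the same shape as the example above (values are simulated)
set.seed(1)
cpueData <- data.frame(
  outing_ID = rep(c("51801", "51802", "51803", "51804", "51805"), each = 20),
  LRVS_cpue = rnorm(100) * 10
)

# ave() returns, for every row, the maximum LRVS_cpue within that row's outing_ID
groupMax <- ave(cpueData$LRVS_cpue, cpueData$outing_ID, FUN = max)

# Keep the complete rows that reach their group's maximum.
# Tied rows are all kept; which() drops comparisons that evaluate to NA.
maxRows <- cpueData[which(cpueData$LRVS_cpue == groupMax), ]

# Alternative: keep exactly one row per outing_ID by sorting each group
# in descending order and taking the first row of each group
sortedData <- cpueData[order(cpueData$outing_ID, -cpueData$LRVS_cpue), ]
oneRowEach <- sortedData[!duplicated(sortedData$outing_ID), ]
```

Both versions are vectorised, so they should avoid the slow loop, although I have not timed them on 650 000 rows.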

Hope this helps,
Patrick

On 4/16/13 5:12 AM, Jade Maggs wrote:
> Hi list, I need to subset the dataframe below by selecting rows with maximum
> LRVS_cpue values for each outing_ID. For example, where outing_ID == 51801,
> the new dataframe should have only one row with LRVS_cpue = 0.5. LRVS_cpue
> in all other rows should remain as 0. I have over 650 000 rows, so looping
> is very slow.
>
> I have tried:
>
>   cpueData1 <- data.frame(unique(cpueData[max(cpueData$LRVS_cpue),]))
>
> but this does not work.
>
> Any help would be greatly appreciated.
>
> patrol_ID  outing_ID  num_anglers  hours_fish  ang_hours  LRVS_cpue
> 51709      51795      2            3.5         7          0
> 51709      51796      1            0.5         0.5        0
> 51709      51797      1            1           1          0
> 51709      51798      1            2           2          0
> 51709      51799      5            5.5         27.5       0
> 51709      51800      1            3           3          0
> 51709      51801      2            1           2          0
> 51709      51801      2            1           2          0.5
> 51709      51802      1            1.5         1.5        0
> 51709      51803      3            1           3          0
> 51709      51804      4            1           4          0
>
> JADE MAGGS
>
> Assistant Scientist
>
>
> _______________________________________________
> R-sig-ecology mailing list
> R-sig-ecology at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


