[R] conditional selection of dataframe rows

Marc Schwartz marc_schwartz at me.com
Thu Aug 12 21:32:23 CEST 2010


On Aug 12, 2010, at 2:24 PM, Marc Schwartz wrote:

> On Aug 12, 2010, at 2:11 PM, Toby Gass wrote:
> 
>> Dear helpeRs,
>> 
>> I have a dataframe (14947 x 27) containing measurements collected 
>> every 5 seconds at several different sampling locations.  If one 
>> measurement at a given location is less than zero on a given day, I 
>> would like to delete all measurements from that location on that day.
>> 
>> Here is a toy example:
>> 
>> toy <- data.frame(CH = rep(3:5,3), DAY = c(rep(4,5), rep(5,4)), 
>> SLOPE = c(seq(0.2,0.6, .1),seq(0.2, -0.1, -0.1)))
>> 
>> In this example, row 9 has a negative measurement for Chamber 5, so I 
>> would like to delete row 6, which is the same Chamber on the same 
>> day, but not row 3, which is the same chamber on a different day.  In 
>> the full dataframe, there are, of course, many more days.
>> 
>> Is there a handy R way to do this?
>> 
>> Thank you for the assistance.
>> 
>> Toby
> 
> 
> 
> Not fully tested, but here is one possibility:
> 
>> toy
>  CH DAY SLOPE
> 1  3   4   0.2
> 2  4   4   0.3
> 3  5   4   0.4
> 4  3   4   0.5
> 5  4   4   0.6
> 6  5   5   0.2
> 7  3   5   0.1
> 8  4   5   0.0
> 9  5   5  -0.1
> 
> 
>> subset(toy, ave(SLOPE, CH, DAY, FUN = function(x) any(x < 0)) == 0)
>  CH DAY SLOPE
> 1  3   4   0.2
> 2  4   4   0.3
> 3  5   4   0.4
> 4  3   4   0.5
> 5  4   4   0.6
> 7  3   5   0.1
> 8  4   5   0.0
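
To see why this works, it can help to look at the intermediate vector that ave() produces. The following is only a sketch on the same toy data; the variable name flag is introduced here for illustration and is not part of the original question.

```r
# Rebuild the toy data frame from the original question
toy <- data.frame(CH = rep(3:5, 3), DAY = c(rep(4, 5), rep(5, 4)),
                  SLOPE = c(seq(0.2, 0.6, 0.1), seq(0.2, -0.1, -0.1)))

# For each CH/DAY group, any(x < 0) yields a single TRUE/FALSE. ave()
# recycles that value to every row of the group and stores it as numeric
# 1/0, because it is assigned back into the numeric SLOPE vector.
flag <- with(toy, ave(SLOPE, CH, DAY, FUN = function(x) any(x < 0)))
flag

# Rows 6 and 9 (CH 5 on DAY 5) are flagged 1; keep the rows flagged 0
toy[flag == 0, ]
```

The one-line subset() call above does the same thing without the intermediate variable.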


This can be shortened slightly, because ave() returns 0/1 here and ! applied to that vector is TRUE exactly where it is 0:

> subset(toy, !ave(SLOPE, CH, DAY, FUN = function(x) any(x < 0)))
  CH DAY SLOPE
1  3   4   0.2
2  4   4   0.3
3  5   4   0.4
4  3   4   0.5
5  4   4   0.6
7  3   5   0.1
8  4   5   0.0
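
One caveat that may be worth checking against the full data: if SLOPE contains missing values, any(x < 0) returns NA for a group that has an NA but no negative value, and subset() treats an NA condition as FALSE, so the whole group is dropped. If that is not wanted, na.rm = TRUE is one option. The sketch below uses a made-up data frame, toy2, that is not from the original question.

```r
# Hypothetical data: CH 3 has a missing value, CH 4 has a negative value
toy2 <- data.frame(CH = c(3, 3, 4, 4), DAY = c(4, 4, 4, 4),
                   SLOPE = c(0.2, NA, 0.3, -0.1))

# Without na.rm, the CH 3 group evaluates to NA and is dropped as well,
# so no rows are left
dropped <- subset(toy2, !ave(SLOPE, CH, DAY, FUN = function(x) any(x < 0)))
nrow(dropped)

# With na.rm = TRUE, only the group with an actual negative value (CH 4)
# is removed; both CH 3 rows are kept, including the NA row
kept <- subset(toy2, !ave(SLOPE, CH, DAY,
                          FUN = function(x) any(x < 0, na.rm = TRUE)))
kept
```

Whether a group with missing measurements should be kept or discarded depends on the data, so this is a choice to make deliberately rather than a fix.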


HTH,

Marc


