[R] conditional selection of dataframe rows

David Winsemius dwinsemius at comcast.net
Thu Aug 12 22:39:53 CEST 2010


On Aug 12, 2010, at 4:06 PM, Toby Gass wrote:

> Thank you all for the quick responses.  So far as I've checked,
> Marc's solution works perfectly and is quite speedy.  I'm still
> trying to figure out what it is doing. :)
>
> Henrique's solution seems to need some columns somewhere.  David's
> solution does not find all the other measurements, possibly with
> positive values, taken on the same day.

I assumed you only wanted to look at what appeared to be a data  
column, SLOPE. If you want to look at all columns for negatives then  
try:

toy[ which( apply(toy, 1, function(x) all(x >= 0)) ), ]  # or
toy[ apply(toy, 1, function(x) all(x >= 0)) , ]

This is how they differ w.r.t. their handling of NAs.

 > toy[3,2] <- NA
 > toy[ apply(toy, 1, function(x) all(x >= 0)) , ]
    CH DAY SLOPE
1   3   4   0.2
2   4   4   0.3
NA NA  NA    NA
4   3   4   0.5
5   4   4   0.6
6   5   5   0.2
7   3   5   0.1
8   4   5   0.0
 > toy[ which(apply(toy, 1, function(x) all(x >= 0)) ), ]
   CH DAY SLOPE
1  3   4   0.2
2  4   4   0.3
4  3   4   0.5
5  4   4   0.6
6  5   5   0.2
7  3   5   0.1
8  4   5   0.0
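
Neither line above removes the other rows from the same chamber and
day. If that is what you want across several measurement columns, one
possible sketch follows, reusing Marc's ave() idea. It is only lightly
checked. FLUX is a made-up second measurement column added purely for
illustration, and meas is a placeholder for whatever your measurement
columns are actually called.

```r
# Toy data from the original post
toy <- data.frame(CH = rep(3:5, 3), DAY = c(rep(4, 5), rep(5, 4)),
                  SLOPE = c(seq(0.2, 0.6, 0.1), seq(0.2, -0.1, -0.1)))

# Hypothetical second measurement column (not in the original data),
# with one negative value in row 5 (chamber 4, day 4)
toy$FLUX <- c(1, 1, 1, 1, -1, 1, 1, 1, 1)

# Assumed names of the measurement columns; adjust to your data
meas <- c("SLOPE", "FLUX")

# Step 1: one flag per row, 1 if any measurement in that row is negative.
# na.rm = TRUE treats a missing measurement as "not negative".
bad_row <- as.numeric(rowSums(toy[meas] < 0, na.rm = TRUE) > 0)

# Step 2: ave() splits bad_row by CH and DAY, applies FUN within each
# group, and returns a vector as long as the input, so every row
# carries the result for its own chamber-day group (1 = has a negative).
bad_group <- ave(bad_row, toy$CH, toy$DAY, FUN = function(x) any(x > 0))

# Step 3: keep only rows whose chamber-day group has no negatives
clean <- toy[bad_group == 0, ]
clean
```

With the toy data this should keep rows 1, 3, 4, 7 and 8: rows 6 and 9
go because of the negative SLOPE in chamber 5 on day 5, and rows 2 and
5 go because of the made-up negative FLUX in chamber 4 on day 4.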


>
> Thank you again for your efforts.
>
> Toby
>
> On 12 Aug 2010 at 14:32, Marc Schwartz wrote:
>
>> On Aug 12, 2010, at 2:24 PM, Marc Schwartz wrote:
>>
>>> On Aug 12, 2010, at 2:11 PM, Toby Gass wrote:
>>>
>>>> Dear helpeRs,
>>>>
>>>> I have a dataframe (14947 x 27) containing measurements collected
>>>> every 5 seconds at several different sampling locations.  If one
>>>> measurement at a given location is less than zero on a given day, I
>>>> would like to delete all measurements from that location on that day.
>>>>
>>>> Here is a toy example:
>>>>
>>>> toy <- data.frame(CH = rep(3:5,3), DAY = c(rep(4,5), rep(5,4)),
>>>> SLOPE = c(seq(0.2,0.6, .1),seq(0.2, -0.1, -0.1)))
>>>>
>>>> In this example, row 9 has a negative measurement for Chamber 5, so
>>>> I would like to delete row 6, which is the same Chamber on the same
>>>> day, but not row 3, which is the same chamber on a different day.
>>>> In the full dataframe, there are, of course, many more days.
>>>>
>>>> Is there a handy R way to do this?
>>>>
>>>> Thank you for the assistance.
>>>>
>>>> Toby
>>>
>>>
>>>
>>> Not fully tested, but here is one possibility:
>>>
>>>> toy
>>>   CH DAY SLOPE
>>> 1  3   4   0.2
>>> 2  4   4   0.3
>>> 3  5   4   0.4
>>> 4  3   4   0.5
>>> 5  4   4   0.6
>>> 6  5   5   0.2
>>> 7  3   5   0.1
>>> 8  4   5   0.0
>>> 9  5   5  -0.1
>>>
>>>
>>>> subset(toy, ave(SLOPE, CH, DAY, FUN = function(x) any(x < 0)) == 0)
>>>   CH DAY SLOPE
>>> 1  3   4   0.2
>>> 2  4   4   0.3
>>> 3  5   4   0.4
>>> 4  3   4   0.5
>>> 5  4   4   0.6
>>> 7  3   5   0.1
>>> 8  4   5   0.0
>>
>>
>> This can actually be slightly shortened to:
>>
>>> subset(toy, !ave(SLOPE, CH, DAY, FUN = function(x) any(x < 0)))
>>   CH DAY SLOPE
>> 1  3   4   0.2
>> 2  4   4   0.3
>> 3  5   4   0.4
>> 4  3   4   0.5
>> 5  4   4   0.6
>> 7  3   5   0.1
>> 8  4   5   0.0
>>
>>
>> HTH,
>>
>> Marc
>>
>

David Winsemius, MD
West Hartford, CT


