[R] conditional selection of dataframe rows

Toby Gass tobygass at warnercnr.colostate.edu
Thu Aug 12 23:20:31 CEST 2010


Hi,

I do want to look only at slope.
If there is one negative slope measurement  for a given day and a 
given chamber, I would like to remove all other slope measurements 
for that day and that chamber, even if they are positive.  

On one day, I will have 20 slope measurements for each chamber.  If 
one is negative, I would like to delete the other 19 for that chamber 
on that day, even if they are positive.  I have measurements for 
every day of the year, for 4 years and multiple chambers.  

I know I could make some awful nested loop with a vector of day and 
chamber numbers for each occurrence of a negative slope and then run 
that against the whole data set but I hope not to have to do that.
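
The nested loop can probably be avoided with a vectorised lookup. The sketch below uses the toy data frame from the original post: it tags each row with its chamber and day, collects the tags of groups that contain a negative slope, and drops every row carrying one of those tags. The names key, bad and clean are illustrative only, and pasting CH and DAY together is just one assumed way of identifying a group.

```r
# Toy data from the original post
toy <- data.frame(CH = rep(3:5, 3),
                  DAY = c(rep(4, 5), rep(5, 4)),
                  SLOPE = c(seq(0.2, 0.6, 0.1), seq(0.2, -0.1, -0.1)))

# One key per row identifying its (chamber, day) group
key <- paste(toy$CH, toy$DAY, sep = "_")

# Keys of the groups that contain at least one negative slope
bad <- unique(key[toy$SLOPE < 0])

# Keep only the rows whose group was not flagged
clean <- toy[!(key %in% bad), ]
clean
```

On the toy data this drops rows 6 and 9 (chamber 5, day 5) and keeps row 3 (chamber 5, day 4).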

Here is the rationale, if that helps.  These are unattended outdoor 
chambers that measure soil carbon efflux.  When the numbers go 
negative during part of the day but otherwise look normal, it usually 
means a plant has sprouted in the chamber and is using the carbon 
dioxide.  That means the measurements are all lower than they should 
be and I need to discard all measurements collected on that day, 
whether positive or negative.

It might have been a little clearer if I had made the toy dataframe a 
bit larger.  
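
A slightly larger toy data frame might look like the sketch below. The values, the READING column, and the object names big, flag and kept are all invented for illustration; the filtering step reuses the ave() approach Marc posted in this thread.

```r
# Hypothetical larger example: 2 chambers x 2 days x 4 readings each
big <- expand.grid(READING = 1:4, CH = 1:2, DAY = 1:2)
big$SLOPE <- 0.5

# Chamber 2 goes negative for a single reading on day 1
big$SLOPE[big$CH == 2 & big$DAY == 1 & big$READING == 3] <- -0.2

# For every row, record whether its (CH, DAY) group has any negative slope
flag <- ave(big$SLOPE, big$CH, big$DAY, FUN = function(x) any(x < 0))

# Keep only rows from groups with no negative slope
kept <- big[flag == 0, ]

# Rows remaining per chamber (rows) and day (columns)
table(kept$CH, kept$DAY)
```

All four readings for chamber 2 on day 1 should disappear, including the three positive ones, while the other chamber and the other day are left alone.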

Thanks again for the assistance.

Toby



On 12 Aug 2010 at 16:39, David Winsemius wrote:

> 
> On Aug 12, 2010, at 4:06 PM, Toby Gass wrote:
> 
> > Thank you all for the quick responses.  So far as I've checked,
> > Marc's solution works perfectly and is quite speedy.  I'm still
> > trying to figure out what it is doing. :)
> >
> > Henrique's solution seems to need some columns somewhere.  David's
> > solution does not find all the other measurements, possibly with
> > positive values, taken on the same day.
> 
> I assumed you only wanted to look at what appeared to be a data  
> column, SLOPE. If you want to look at all columns for negatives then  
> try:
> 
> toy[ which( apply(toy, 1, function(x) all(x >= 0)) ), ]  # or
> toy[ apply(toy, 1, function(x) all(x >= 0)) , ]
> 
> This is how they differ w.r.t. their handling of NAs:
> 
>  > toy[3,2] <- NA
>  > toy[ apply(toy, 1, function(x) all(x >= 0)) , ]
>     CH DAY SLOPE
> 1   3   4   0.2
> 2   4   4   0.3
> NA NA  NA    NA
> 4   3   4   0.5
> 5   4   4   0.6
> 6   5   5   0.2
> 7   3   5   0.1
> 8   4   5   0.0
>  > toy[ which(apply(toy, 1, function(x) all(x >= 0)) ), ]
>    CH DAY SLOPE
> 1  3   4   0.2
> 2  4   4   0.3
> 4  3   4   0.5
> 5  4   4   0.6
> 6  5   5   0.2
> 7  3   5   0.1
> 8  4   5   0.0
> 
> 
> >
> > Thank you again for your efforts.
> >
> > Toby
> >
> > On 12 Aug 2010 at 14:32, Marc Schwartz wrote:
> >
> >> On Aug 12, 2010, at 2:24 PM, Marc Schwartz wrote:
> >>
> >>> On Aug 12, 2010, at 2:11 PM, Toby Gass wrote:
> >>>
> >>>> Dear helpeRs,
> >>>>
> >>>> I have a dataframe (14947 x 27) containing measurements collected
> >>>> every 5 seconds at several different sampling locations.  If one
> >>>> measurement at a given location is less than zero on a given day,
> >>>> I would like to delete all measurements from that location on that day.
> >>>>
> >>>> Here is a toy example:
> >>>>
> >>>> toy <- data.frame(CH = rep(3:5,3), DAY = c(rep(4,5), rep(5,4)),
> >>>> SLOPE = c(seq(0.2,0.6, .1),seq(0.2, -0.1, -0.1)))
> >>>>
> >>>> In this example, row 9 has a negative measurement for Chamber 5,
> >>>> so I would like to delete row 6, which is the same chamber on the
> >>>> same day, but not row 3, which is the same chamber on a different
> >>>> day.  In the full dataframe, there are, of course, many more days.
> >>>>
> >>>> Is there a handy R way to do this?
> >>>>
> >>>> Thank you for the assistance.
> >>>>
> >>>> Toby
> >>>
> >>>
> >>>
> >>> Not fully tested, but here is one possibility:
> >>>
> >>>> toy
> >>>   CH DAY SLOPE
> >>> 1  3   4   0.2
> >>> 2  4   4   0.3
> >>> 3  5   4   0.4
> >>> 4  3   4   0.5
> >>> 5  4   4   0.6
> >>> 6  5   5   0.2
> >>> 7  3   5   0.1
> >>> 8  4   5   0.0
> >>> 9  5   5  -0.1
> >>>
> >>>
> >>>> subset(toy, ave(SLOPE, CH, DAY, FUN = function(x) any(x < 0)) == 0)
> >>>   CH DAY SLOPE
> >>> 1  3   4   0.2
> >>> 2  4   4   0.3
> >>> 3  5   4   0.4
> >>> 4  3   4   0.5
> >>> 5  4   4   0.6
> >>> 7  3   5   0.1
> >>> 8  4   5   0.0
> >>
> >>
> >> This can actually be slightly shortened to:
> >>
> >>> subset(toy, !ave(SLOPE, CH, DAY, FUN = function(x) any(x < 0)))
> >>   CH DAY SLOPE
> >> 1  3   4   0.2
> >> 2  4   4   0.3
> >> 3  5   4   0.4
> >> 4  3   4   0.5
> >> 5  4   4   0.6
> >> 7  3   5   0.1
> >> 8  4   5   0.0
> >>
> >>
> >> HTH,
> >>
> >> Marc
> >>
> >
> 
> David Winsemius, MD
> West Hartford, CT
>
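
For readers wondering what Marc's ave() call is doing, one way to unpack it is sketched below. This is an interpretation and not Marc's own explanation. The names flag and flag_na are illustrative, and the na.rm = TRUE variant rests on the assumption that a missing slope should not count as negative.

```r
# Toy data from the original post
toy <- data.frame(CH = rep(3:5, 3),
                  DAY = c(rep(4, 5), rep(5, 4)),
                  SLOPE = c(seq(0.2, 0.6, 0.1), seq(0.2, -0.1, -0.1)))

# Step 1: within each (CH, DAY) group, ask whether any slope is negative.
# ave() returns a numeric vector as long as SLOPE, so TRUE/FALSE become 1/0
# and each group's answer is repeated on every row of that group.
flag <- with(toy, ave(SLOPE, CH, DAY, FUN = function(x) any(x < 0)))
flag
# 0 0 0 0 0 1 0 0 1

# Step 2: keep the rows whose group flag is 0 (no negatives in the group).
# !flag in Marc's shortened version selects the same rows.
toy[flag == 0, ]

# If SLOPE can contain NA, any(x < 0) may return NA for a group.
# na.rm = TRUE ignores missing slopes when building the flag
# (assumption: a missing slope is not evidence of a negative one).
flag_na <- with(toy, ave(SLOPE, CH, DAY,
                         FUN = function(x) any(x < 0, na.rm = TRUE)))
```

The difference from the apply() approach quoted above is that the test is made per group and not per row, which is why row 6 is removed along with row 9.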


