[R] Selection on dataframe based on order of rows
Bonfigli Sandro
bonfigli at inmi.it
Tue Aug 22 20:15:42 CEST 2006
I have a dataframe with the following structure:
id date value
-------------------------
1 22/08/2006 48
1 24/08/2006 50
1 28/08/2006 150
1 30/08/2006 100
1 01/09/2006 30
2 11/08/2006 30
2 22/08/2006 100
2 28/08/2006 11
2 02/09/2006 5
3 01/07/2006 3
3 01/08/2006 100
3 01/09/2006 100
4 22/08/2006 48
4 24/08/2006 50
4 28/08/2006 150
4 30/08/2006 100
4 01/09/2006 30
4 03/09/2006 100
4 06/09/2006 100
N.B.: dates are in European format (dd/mm/yyyy); the dataframe is ordered by id and date.
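For anyone who wants to reproduce the example, the table above could be
built roughly as follows. This is only a sketch: the object name `df` and
the choice to store the dates as Date objects are assumptions made for
illustration.

```r
# Example data copied from the table above; the name `df` is an assumption.
df <- data.frame(
  id    = rep(1:4, times = c(5, 4, 3, 7)),
  date  = c("22/08/2006", "24/08/2006", "28/08/2006", "30/08/2006", "01/09/2006",
            "11/08/2006", "22/08/2006", "28/08/2006", "02/09/2006",
            "01/07/2006", "01/08/2006", "01/09/2006",
            "22/08/2006", "24/08/2006", "28/08/2006", "30/08/2006", "01/09/2006",
            "03/09/2006", "06/09/2006"),
  value = c(48, 50, 150, 100, 30,
            30, 100, 11, 5,
            3, 100, 100,
            48, 50, 150, 100, 30, 100, 100),
  stringsAsFactors = FALSE
)

# Parse the day/month/year strings into Date objects.
df$date <- as.Date(df$date, format = "%d/%m/%Y")

# Order by id, then by date within each id (the data are already in this order).
df <- df[order(df$id, df$date), ]
```

Storing the dates as Date objects rather than as strings means that
ordering follows the calendar and not the text.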
For each id I need to select the first row that starts a run of at
least two consecutive rows with "value" > 50.
That is a rather convoluted explanation. I mean that for each id I have
to select the first row in which "value" is > 50, but only if at least
the following row has "value" > 50 too. If this is not true, I repeat the
test for each of the following rows in which "value" > 50 until I find a
record that satisfies the condition.
This means that with my example dataframe the result is:
id date value
-------------------------
1 28/08/2006 150
3 01/08/2006 100
4 28/08/2006 150
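To make the rule concrete, here is a minimal vectorized sketch of it. It
assumes that "the following row" means the immediately following row
within the same id, that the threshold is strictly > 50, and that the data
are already ordered by id and date. The names `df`, `above`,
`next_same_id`, `next_above`, `start`, `candidates` and `result` are
hypothetical.

```r
# Example data from the table above (rebuilt here so the sketch runs on its own).
df <- data.frame(
  id    = rep(1:4, times = c(5, 4, 3, 7)),
  date  = as.Date(c("22/08/2006", "24/08/2006", "28/08/2006", "30/08/2006",
                    "01/09/2006",
                    "11/08/2006", "22/08/2006", "28/08/2006", "02/09/2006",
                    "01/07/2006", "01/08/2006", "01/09/2006",
                    "22/08/2006", "24/08/2006", "28/08/2006", "30/08/2006",
                    "01/09/2006", "03/09/2006", "06/09/2006"),
                  format = "%d/%m/%Y"),
  value = c(48, 50, 150, 100, 30,
            30, 100, 11, 5,
            3, 100, 100,
            48, 50, 150, 100, 30, 100, 100)
)

n <- nrow(df)

# TRUE where the row itself is above the threshold.
above <- df$value > 50

# TRUE where the next row exists and belongs to the same id.
next_same_id <- c(df$id[-1] == df$id[-n], FALSE)

# TRUE where the next row is above the threshold (FALSE for the last row).
next_above <- c(above[-1], FALSE)

# Rows that start a pair of consecutive rows > 50 within one id.
start <- above & next_same_id & next_above

# Keep only the first such row for each id.
candidates <- df[start, ]
result <- candidates[!duplicated(candidates$id), ]
print(result)
```

On the example data this picks the rows for id 1, 3 and 4 listed above;
id 2 drops out because its only row > 50 is followed by 11.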
It's clear that a for loop would work, but I think there is a better
way.
I tried "by" and could obtain the first row for which "value" is > 50.
I thought of an iterative process (delete the first row > 50, find the
second row > 50, check whether there are rows in between) but it
is quite inelegant: if the first row is not the "good" one, I have to
repeat the process an a priori unknown number of times.
Thanks in advance for your help
Sandro Bonfigli