[R] extracting data coincident with the beginning and end of multiple streaks (rle)
arun
smartpink111 at yahoo.com
Sat Jun 22 01:23:40 CEST 2013
Hi,
Maybe this helps:
###Added more lines of fake data
fn_hp<- read.table(text="
id date wl_m wet cuml_day
585 fn 2012-03-03 0.1527048 1 1
586 fn 2012-03-04 0.2121408 1 2
587 fn 2012-03-05 0.1877568 1 3
588 fn 2012-03-06 0.1709928 1 4
589 fn 2012-03-07 0.1642872 1 5
598 fn 2012-03-16 0.0182880 0 1
599 fn 2012-03-17 -0.0076200 0 2
600 fn 2012-03-18 -0.0067056 0 3
601 fn 2012-03-19 -0.0097536 0 4
602 fn 2012-03-20 0.0015240 0 5
603 fn 2012-03-21 -0.0067056 0 6
604 fn 2012-03-22 0.0003048 0 7
605 fn 2012-03-23 0.0024384 0 8
606 fn 2012-03-24 -0.0054864 0 9
607 fn 2012-03-25 -0.0067056 1 1
608 fn 2012-03-26 0.0003048 1 2
609 fn 2012-03-27 0.0024384 1 3
610 fn 2012-03-28 -0.0054864 1 4
",sep="",header=TRUE,stringsAsFactors=FALSE)
fn_hp1<- fn_hp
fn_hp$DESIRED.col<-NA
#streak id: cuml_day restarts at 1 whenever a new streak begins (also works for streaks of 1 or 2 days)
fn_hp$IDNew<- cumsum(fn_hp$cuml_day==1)
res1<- unsplit(lapply(split(fn_hp,fn_hp$IDNew),function(x){
  #put the streak length on the first and last row of each streak
  x$DESIRED.col[1]<- tail(x$cuml_day,1)
  x$DESIRED.col[nrow(x)]<- x$DESIRED.col[1]
  x
}),fn_hp$IDNew)[,-7] #drop the helper column IDNew
res1[!is.na(res1$DESIRED.col),]
# id date wl_m wet cuml_day DESIRED.col
#585 fn 2012-03-03 0.1527048 1 1 5
#589 fn 2012-03-07 0.1642872 1 5 5
#598 fn 2012-03-16 0.0182880 0 1 9
#606 fn 2012-03-24 -0.0054864 0 9 9
#607 fn 2012-03-25 -0.0067056 1 1 4
#610 fn 2012-03-28 -0.0054864 1 4 4
#or
#streak id: cuml_day restarts at 1 at the start of each streak
fn_hp1$IDNew<- cumsum(fn_hp1$cuml_day==1)
library(plyr)
#assumes every streak has at least two rows; rep() fails for a one-row streak
res2<-ddply(fn_hp1,.(IDNew),mutate,DESIRED.col=c(tail(cuml_day,1),rep(NA,length(cuml_day)-2),tail(cuml_day,1)))[,-6]
row.names(res2)<- row.names(fn_hp1)
res2[!is.na(res2$DESIRED.col),]
# id date wl_m wet cuml_day DESIRED.col
#585 fn 2012-03-03 0.1527048 1 1 5
#589 fn 2012-03-07 0.1642872 1 5 5
#598 fn 2012-03-16 0.0182880 0 1 9
#606 fn 2012-03-24 -0.0054864 0 9 9
#607 fn 2012-03-25 -0.0067056 1 1 4
#610 fn 2012-03-28 -0.0054864 1 4 4
#or
#if the `DESIRED.col` is not needed
res3<- ddply(fn_hp1,.(IDNew),function(x) x[c(1,nrow(x)),])[,-6]
res3
# id date wl_m wet cuml_day
#1 fn 2012-03-03 0.1527048 1 1
#2 fn 2012-03-07 0.1642872 1 5
#3 fn 2012-03-16 0.0182880 0 1
#4 fn 2012-03-24 -0.0054864 0 9
#5 fn 2012-03-25 -0.0067056 1 1
#6 fn 2012-03-28 -0.0054864 1 4
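#or
A base R sketch that works directly from rle() on the `wet` column, without relying on `cuml_day`. It puts the streak length on every row (the alternative mentioned in the question) and also builds one line per streak with its start and end dates. The data frame `dat` and the columns `streak_id` and `streak_len` are made up for illustration; they are not part of the original data.

```r
# Toy stand-in for the real data (hypothetical values)
dat <- data.frame(
  date = as.Date("2012-03-03") + 0:8,
  wet  = c(1, 1, 1, 0, 0, 1, 0, 0, 0)
)

r <- rle(dat$wet)

# Streak id: 1 for the first run, 2 for the second, and so on
dat$streak_id  <- rep(seq_along(r$lengths), times = r$lengths)
# Streak length repeated on every row of its streak
dat$streak_len <- rep(r$lengths, times = r$lengths)
# Position within the streak (the same counter as cuml_day)
dat$cuml_day   <- sequence(r$lengths)

# Rows that begin or end a streak; a one-row streak appears once
ends <- dat[dat$cuml_day == 1 | dat$cuml_day == dat$streak_len, ]

# One line per streak: state, first date, last date, number of rows
i_last  <- cumsum(r$lengths)
i_first <- i_last - r$lengths + 1
streaks <- data.frame(
  wet   = r$values,
  start = dat$date[i_first],
  end   = dat$date[i_last],
  days  = r$lengths
)
streaks
```

Note that `days` counts rows, so if the record has missing dates inside a streak, `end - start + 1` may be larger than `days`.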
A.K.
Good day:
I used rle to calculate the wet and dry duration (cuml_day) of wetlands using the "wet" variable from the sample data below.
>cuml_day<- unlist( lapply( rle(fn_hp$wet)$lengths, seq_len)) ### counts consecutive 1s and 0s ###
>fn_hp<-cbind(fn_hp,cuml_day) ### bind cumulative days to the original data frame
I would now like to extract the rows of data that correspond to
the beginning and end of each streak so I can look at both the duration
of the streak and the date ranges where it occurred (to see if wet
periods coincide with amphibian breeding periods).
- An alternative solution would be to add the streak length from rle to each row that was included in the particular streak (DESIRED.col)
I am a relatively new R user and not sure of the best way to approach this. Any insight is appreciated.
-Jeff
id date wl_m wet cuml_day DESIRED.col
585 fn 2012-03-03 0.1527048 1 1 5
586 fn 2012-03-04 0.2121408 1 2 .
587 fn 2012-03-05 0.1877568 1 3 .
588 fn 2012-03-06 0.1709928 1 4 .
589 fn 2012-03-07 0.1642872 1 5 5
598 fn 2012-03-16 0.0182880 0 1 9
599 fn 2012-03-17 -0.0076200 0 2 .
600 fn 2012-03-18 -0.0067056 0 3 .
601 fn 2012-03-19 -0.0097536 0 4 .
602 fn 2012-03-20 0.0015240 0 5 .
603 fn 2012-03-21 -0.0067056 0 6 .
604 fn 2012-03-22 0.0003048 0 7 .
605 fn 2012-03-23 0.0024384 0 8 .
606 fn 2012-03-24 -0.0054864 0 9 9