[R] identify time span in date vector
Petr PIKAL
petr.pikal at precheza.cz
Wed Apr 4 14:19:17 CEST 2012
Hi
>
> Dear Petr,
>
> thanks for taking your time.
>
> For this input, the first element should be selected since there are more
> than 3 more dates within one year (basically, all other dates are within
> one year) and at least one of them is more than 3 months later.
>
> In the meantime, I came up with some code (probably) doing what I want:
>
> identify_first_date = function(dates)
> {
>   ### later dates within one year?
>   within_one_year = as.matrix(dist(dates)) < 366
>   within_one_year[upper.tri(within_one_year, diag=TRUE)]=FALSE
>
>   ### later dates within 90 days?
>   within_one_month = as.matrix(dist(dates)) < 91
>   within_one_month[upper.tri(within_one_month, diag=TRUE)]=FALSE
>
>   dates[
>     which(
>       ### more dates within one year than within 90 days
>       apply(within_one_year,2,sum) > apply(within_one_month,2,sum) &
>       ### at least 3 more dates within one year
>       apply(within_one_year,2,sum) >=3
>     )[1]]
> }
>
> I guess the code could be improved, though; it takes some time.
Your first condition can be fulfilled by
c(as.numeric(diff(dates))<365, FALSE) > c(as.numeric(diff(dates))<91, FALSE)
so if you put it in your function
identify_first_date2 = function(dates)
{
  # lower triangle: later dates (input sorted) less than 366 days away
  within_one_year = as.matrix(dist(dates)) < 366
  within_one_year[upper.tri(within_one_year, diag=TRUE)] = FALSE
  # gap in days to the next date
  distance <- as.numeric(diff(dates))
  dates[ which( c(distance<365, FALSE) > c(distance<91, FALSE) &
    apply(within_one_year,2,sum) >= 3)[1]]
}
You should get some improvement; however, I am still struggling to work out
how many consecutive dates are within one year.
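One possible way to count the later dates directly is sketched below. It is
not taken from the code above: the function name first_qualifying_date and its
parameters are made up for illustration, and it assumes the dates are already
sorted. findInterval() returns, for each date, the position of the last date
that falls within a given number of days, so the number of later dates in a
window is that position minus the date's own index. Unlike diff(), which only
compares each date with the one immediately following it, this counts all
later dates in the window.

```r
# Hypothetical helper (illustration only); assumes `dates` is a sorted Date vector.
first_qualifying_date <- function(dates, min_later = 3, near = 90, far = 365) {
  d <- as.numeric(dates)
  idx <- seq_along(d)
  # position of the last date at most `far` (or `near`) days after each date
  n_far  <- findInterval(d + far,  d) - idx   # later dates within `far` days
  n_near <- findInterval(d + near, d) - idx   # later dates within `near` days
  # enough later dates within a year, and at least one beyond `near` days
  ok <- n_far >= min_later & n_far > n_near
  dates[which(ok)[1]]                         # NA if no date qualifies
}

# The ten dates printed earlier in this thread
dates <- as.Date(c("2007-08-01", "2007-10-21", "2007-12-08", "2007-12-15",
                   "2008-01-29", "2008-02-14", "2008-02-16", "2008-03-01",
                   "2008-04-02", "2008-04-11"))
first_qualifying_date(dates)
# [1] "2007-08-01"
```

This avoids building the full distance matrix, which may matter for long
date vectors, but I have not timed it against the versions above.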
>
> Best,
> Felix
>
>
> -----Original Message-----
> From: Petr PIKAL [mailto:petr.pikal at precheza.cz]
> Sent: Wednesday, 4 April 2012 09:47
> To: Fischer, Felix
> Cc: r-help at r-project.org
> Subject: Odp: [R] identify time span in date vector
>
> Hi
>
> Can you please be more specific? Based on this input, what do you want
> as a result?
>
> > set.seed(111)
> > dates = as.Date(sort(rnorm(10,3000,100)), origin = "2000-1-1")
> > dates
> [1] "2007-08-01" "2007-10-21" "2007-12-08" "2007-12-15" "2008-01-29"
> [6] "2008-02-14" "2008-02-16" "2008-03-01" "2008-04-02" "2008-04-11"
> >
>
> Regards
> Petr
>
> >
> > Hello everyone,
> >
> > I am trying to identify the first element of a date vector for which the
> > following condition holds: at least 3 more dates within the next 365 days,
> > but at least one of these must be 3 to 12 months later.
> >
> > dates = as.Date(sort(rnorm(10,3000,100)), origin = "2000-1-1")
> >
> > Does anyone have an idea how to do this economically? I'll need to apply
> > this to a large dataset with date vectors of various lengths, and I can
> > think only of quite difficult algorithms :(
> >
> > Any ideas would be appreciated,
> > Felix
> >
> >
>