[R] identify time span in date vector

Fischer, Felix Felix.Fischer at charite.de
Wed Apr 4 10:51:42 CEST 2012


Dear David, 

thanks for your suggestion. 

However, when applied to 

dates = as.Date(c("2001-1-1", "2001-1-3", "2001-1-12", "2001-1-13", "2001-4-20"))

it doesn't behave the way I want:

> which( dates[4:(length(dates))] -dates[1:(length(dates)-3)] <365 &
       dates[3:(length(dates)-1)] -dates[1:(length(dates)-3)] > 90)

integer(0)

The condition is true for the first element of the vector: there are 4 more dates within one year, and one of them ("2001-4-20") is more than 90 days away.
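As a quick check of that claim, here is a minimal sketch that looks only at the gaps between the first date and all later ones. The variable name `gaps` is introduced just for this illustration.

```r
dates <- as.Date(c("2001-1-1", "2001-1-3", "2001-1-12", "2001-1-13", "2001-4-20"))

# Days between the first date and each later date
gaps <- as.numeric(dates[-1] - dates[1])
gaps                          # 2 11 12 109

# At least 3 later dates within one year of the first date?
sum(gaps < 366) >= 3          # TRUE

# At least one of them more than 90 days away?
any(gaps > 90 & gaps < 366)   # TRUE
```

The lagged comparison above only tests the date two positions further on (11 days after the first date here), which is why it misses this case.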

I came up with the following solution:


identify_first_date = function(dates)
{
	### later dates (lower triangle) less than 366 days away?
	within_one_year = as.matrix(dist(dates)) < 366
	within_one_year[upper.tri(within_one_year, diag=TRUE)] = FALSE

	### later dates less than 91 days away, i.e. within 90 days (despite the name)?
	within_one_month = as.matrix(dist(dates)) < 91
	within_one_month[upper.tri(within_one_month, diag=TRUE)] = FALSE

	dates[
		which(
			### at least one later date is more than 90 days away but still within one year
			apply(within_one_year,2,sum) > apply(within_one_month,2,sum) &
			### at least 3 later dates within one year
			apply(within_one_year,2,sum) >= 3
		)[1]]
}

identify_first_date(dates)
[1] "2001-01-01"

However, this takes some time (a couple of minutes) with my dataset of 250 000 date vectors.
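One possible way to avoid the full distance matrix, as a rough sketch only: if each date vector is sorted in ascending order (an assumption; the matrix version does not require it), the two counts can be read off with `findInterval()`. The function name `identify_first_date_fast` is made up for this illustration, and `left.open = TRUE` needs R 3.3.0 or later.

```r
# Hypothetical helper, not from the thread; assumes `dates` is sorted ascending.
identify_first_date_fast <- function(dates) {
  d   <- as.numeric(dates)
  idx <- seq_along(d)

  # With left.open = TRUE, findInterval(x, d) counts the elements of d that are
  # strictly smaller than x, which mirrors the "< 366" and "< 91" thresholds.
  n_year <- findInterval(d + 366, d, left.open = TRUE) - idx  # later dates within one year
  n_90   <- findInterval(d + 91,  d, left.open = TRUE) - idx  # later dates within 90 days

  # Same condition as before: at least 3 later dates within one year,
  # and at least one of them more than 90 days away.
  dates[which(n_year > n_90 & n_year >= 3)[1]]
}

dates <- as.Date(c("2001-1-1", "2001-1-3", "2001-1-12", "2001-1-13", "2001-4-20"))
identify_first_date_fast(dates)
```

On the example vector this should pick the same date as the matrix version, but it has not been timed on the full dataset, so any speed-up is an expectation rather than a measurement.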

Best, Felix

-----Original Message-----
From: David Winsemius [mailto:dwinsemius at comcast.net]
Sent: Tuesday, 3 April 2012 19:08
To: Fischer, Felix
Cc: r-help at r-project.org
Subject: Re: [R] identify time span in date vector


On Apr 3, 2012, at 9:35 AM, Fischer, Felix wrote:

> Hello everyone,
>
> I am trying to identify the first element of a date vector for which the 
> following condition holds: at least 3 more dates within the next 365 
> days, but at least one of these must be 3-12 months later.
>
> dates = as.Date(sort(rnorm(10,3000,100)), origin = "2000-1-1")
>
> Has anyone an idea how to do this economically? I'll need to apply 
> this to a large dataset with date vectors of various lengths and I can 
> think only of quite difficult algorithms :(
>

which( dates[4:(length(dates))] -dates[1:(length(dates)-3)] <365 &
        dates[3:(length(dates)-1)] -dates[1:(length(dates)-3)] > 90)

[1] 2 3

> Any ideas would be appreciated,
> Felix

David Winsemius, MD
West Hartford, CT


