[R] "Holes" in a data frame with time intervals
jim holtman
jholtman at gmail.com
Mon Mar 28 14:29:42 CEST 2011
Does this do what you want?
> x <- read.table(textConnection("B1 A15 30 2001-01-01 2001-10-15
+ B1 A15 28 2001-10-16 2001-12-31
+ B1 A15 32 2002-01-01 2003-04-18
+ B1 A15 33 2003-04-19 2004-12-31
+ B1 A15 29 2005-03-01 2010-12-31
+ B1 A15 30 2011-02-12 9999-12-31"))
> closeAllConnections()
> x$V4 <- as.Date(x$V4)
> x$V5 <- as.Date(x$V5)
> # build a matrix of events: +1 at each start date, -1 on the day after each end date
> y <- rbind(cbind(x$V4, 1), cbind(x$V5 + 1, -1))
> y <- y[order(y[,1]),]
> y <- cbind(y, count = cumsum(y[,2]))
> y
                 count
 [1,]   11323  1     1
 [2,]   11611  1     2
 [3,]   11611 -1     1
 [4,]   11688  1     2
 [5,]   11688 -1     1
 [6,]   12161  1     2
 [7,]   12161 -1     1
 [8,]   12784 -1     0
 [9,]   12843  1     1
[10,]   14975 -1     0
[11,]   15017  1     1
[12,] 2932897 -1     0
> # a running count of zero means no interval is open, so a gap starts on that day
> missing <- which(y[,'count'] == 0)
> # drop the last row: the count is zero there only because the final interval closes
> missing <- missing[missing != nrow(y)]
> # print intervals
> paste(as.Date(y[missing,1], origin = '1970-1-1')
+ , 'to'
+ , as.Date(y[missing + 1, 1] - 1, origin = '1970-1-1')
+ )
[1] "2005-01-01 to 2005-02-28" "2011-01-01 to 2011-02-11"
>
>
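The session above handles a single branch. If the real data has many branches, the same running-count idea could be wrapped in a function and applied per key. The following is only a sketch under assumptions: the helper name find_gaps and the column names bank, branch, start and end are made up for illustration, and ties are ordered so that a start sorts before an end on the same day.

```r
# Hypothetical helper (not part of the original session): returns the gaps
# between a set of closed date intervals [start, end].
find_gaps <- function(start, end) {
  # one event per boundary: +1 at each start, -1 on the day after each end
  ev <- data.frame(day   = c(as.numeric(start), as.numeric(end) + 1),
                   delta = rep(c(1, -1), each = length(start)))
  # sort by day; on equal days put +1 before -1 so that adjacent
  # intervals never produce a spurious zero count
  ev <- ev[order(ev$day, -ev$delta), ]
  ev$count <- cumsum(ev$delta)
  # a zero count anywhere except the last row marks the start of a gap
  i <- which(ev$count == 0)
  i <- i[i != nrow(ev)]
  data.frame(gap_start = as.Date(ev$day[i], origin = "1970-01-01"),
             gap_end   = as.Date(ev$day[i + 1] - 1, origin = "1970-01-01"))
}

# the example data, with assumed column names
x <- data.frame(
  bank   = "B1",
  branch = "A15",
  start  = as.Date(c("2001-01-01", "2001-10-16", "2002-01-01",
                     "2003-04-19", "2005-03-01", "2011-02-12")),
  end    = as.Date(c("2001-10-15", "2001-12-31", "2003-04-18",
                     "2004-12-31", "2010-12-31", "9999-12-31"))
)

# one data frame of gaps per bank/branch combination
gaps <- lapply(split(x, list(x$bank, x$branch), drop = TRUE),
               function(d) find_gaps(d$start, d$end))
gaps
```

For the example data this should report the same two gaps as above. Because it counts open intervals rather than comparing neighbours, it should also cope with overlapping or nested intervals.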
On Mon, Mar 28, 2011 at 7:56 AM, <ANGELO.LINARDI at bancaditalia.it> wrote:
> Good morning,
>
> I am facing a problem that is very easy to solve with a program, but not so
> easy (at least IMHO) with a "declarative" approach.
>
> I have a data frame df with information about bank branches, where some
> attributes (for example the number of employees assigned) carry a
> validity period (start date/end date, format YYYY-MM-DD).
>
>
>
> The following example will clarify this description:
>
>
>
> BANK_ID  BRANCH_ID  EMPLOYEE #  STARTDATE   ENDDATE
> B1       A15        30          2001-01-01  2001-10-15
> B1       A15        28          2001-10-16  2001-12-31
> B1       A15        32          2002-01-01  2003-04-18
> B1       A15        33          2003-04-19  2004-12-31
> B1       A15        29          2005-03-01  2010-12-31
> B1       A15        30          2011-02-12  9999-12-31
> ...
>
>
>
> I would like to find the "missing time intervals" ("holes"; in the
> example, 2005-01-01 to 2005-02-28 and 2011-01-01 to 2011-02-11).
>
> The "programmer's way" would be:
>
>
>
> * Sort the data by "key" + start date
>
> * For each occurrence add 1 day to end date and compare the
> result with the start date of the following occurrence
>
>
>
> Can someone help me find a "declarative" way to do it?
>
>
>
> Thank you in advance
>
>
>
> Angelo Linardi
>
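The sort-and-compare procedure described in the quoted message can also be written without an explicit loop, by comparing each row with the one that follows it. This is a rough sketch, not a tested solution for the real data: the column names bank, branch, start and end are assumptions, it needs at least two rows, and it assumes the intervals within a branch do not overlap or nest (the running-count approach does not need that assumption).

```r
# the example data, with assumed column names
x <- data.frame(
  bank   = "B1",
  branch = "A15",
  start  = as.Date(c("2001-01-01", "2001-10-16", "2002-01-01",
                     "2003-04-19", "2005-03-01", "2011-02-12")),
  end    = as.Date(c("2001-10-15", "2001-12-31", "2003-04-18",
                     "2004-12-31", "2010-12-31", "9999-12-31"))
)

# sort by key and start date
x <- x[order(x$bank, x$branch, x$start), ]
n <- nrow(x)

# pair each row (all but the last) with the following row (all but the first)
same_key <- x$bank[-n] == x$bank[-1] & x$branch[-n] == x$branch[-1]

# a hole exists when the day after this end date is before the next start date
hole <- same_key & (x$end[-n] + 1 < x$start[-1])

holes <- data.frame(bank      = x$bank[-n][hole],
                    branch    = x$branch[-n][hole],
                    gap_start = x$end[-n][hole] + 1,
                    gap_end   = x$start[-1][hole] - 1)
holes
```

The same_key test keeps the last interval of one branch from being compared with the first interval of the next.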
--
Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?