[R] finding and describing missing data runs in a time series

R. Michael Weylandt michael.weylandt at gmail.com
Mon Feb 13 03:40:31 CET 2012


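I'm not at a computer to test this, but perhaps

rle(is.na(x))

might help, where x is the column of interest (e.g. mydata$pm25). It returns the lengths of the consecutive runs of missing and non-missing values.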
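
A minimal sketch of how the run lengths could be turned into start positions and start times. This is an illustration on a small made-up vector, not run against openair; the names x, dates and na_runs are hypothetical stand-ins for a column such as mydata$pm25 and its date column.

```r
# Toy stand-in for a column such as mydata$pm25 (values are made up)
x     <- c(NA, NA, NA, 4.2, 5.1, NA, 3.3, NA, NA)
# Toy stand-in for the hourly date column
dates <- seq(as.POSIXct("1998-01-01 00:00:00", tz = "GMT"),
             by = "hour", length.out = length(x))

r      <- rle(is.na(x))          # runs of TRUE (missing) and FALSE (present)
ends   <- cumsum(r$lengths)      # index of the last element of each run
starts <- ends - r$lengths + 1   # index of the first element of each run

# Keep only the runs of missing values
na_runs <- data.frame(start_index = starts[r$values],
                      start_time  = dates[starts[r$values]],
                      length      = r$lengths[r$values])
print(na_runs)
```

For this toy vector the missing runs start at positions 1, 6 and 8 and have lengths 3, 1 and 2. On real data, x and dates would be replaced by the column and date vector of interest.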

Michael

On Feb 12, 2012, at 7:36 PM, "Durant, James T. (ATSDR/DTEM/PRMSB)" <hzd3 at cdc.gov> wrote:

> Hi -
> 
> I am trying to find and describe missing data in a time series. For instance, the openair package includes a data frame called "mydata":
> library(openair)
> head(mydata)
> 
>                  date   ws  wd nox no2 o3 pm10    so2      co pm25
> 1 1998-01-01 00:00:00 0.60 280 285  39  1   29 4.7225  3.3725   NA
> 2 1998-01-01 01:00:00 2.16 230  NA  NA NA   37     NA      NA   NA
> 3 1998-01-01 02:00:00 2.76 190  NA  NA  3   34 6.8300  9.6025   NA
> 4 1998-01-01 03:00:00 2.16 170 493  52  3   35 7.6625 10.2175   NA
> 5 1998-01-01 04:00:00 2.40 180 468  78  2   34 8.0700  8.9125   NA
> 6 1998-01-01 05:00:00 3.00 190 264  42  0   16 5.5050  3.0525   NA
> 
> 
> So, for example, for pm25 I would like to be able to detect that there is a run of NAs starting at 1998-01-01 00:00:00 that lasts for 2887 hourly observations. Then I would be able to know that there is an NA at 2910, and so on. The key information I am looking for is when the NAs start and how long each run is. The closest thing I know of is timePlot in the openair package with statistic="frequency", but it only gives monthly summary data and does not tell me whether the missing data are clumped together or dispersed.
> 
> VR
> 
> Jim
> 
> 
> James T. Durant, MSPH CIH
> Emergency Response Coordinator
> US Agency for Toxic Substances and Disease Registry
> Atlanta, GA 30341
> 770-378-1695