[R] how to convert "sloppy data" into a time series?

David Winsemius dwinsemius at comcast.net
Fri Dec 17 04:57:55 CET 2010


On Dec 16, 2010, at 8:35 PM, Mike Williamson wrote:

> Hi All,
>
>    First let me state that I did search for a while on r-help, google, and
> using the "sos" package inside of 'R', without much luck.  I want to know
> how to create a univariate time series from a set of data that will have
> huge time gaps in it.  For instance, here is a snapshot of a piece of data
> that I would like to analyze:
>
> Row             queued_time       processTime
> 50  2010-06-15 21:50:42.443 6.399989e-02 secs
> 63  2010-06-15 21:51:57.347 6.300020e-02 secs
> 156 2010-06-29 14:53:26.073 3.011863e+06 secs
> 175 2010-07-22 10:14:57.503 4.334879e+06 secs
> 278 2010-08-05 11:29:56.713 6.155674e+06 secs
> 509 2010-08-05 11:29:57.443 3.120779e+06 secs
> 531 2010-08-05 11:29:57.543 3.120779e+06 secs
> 555 2010-08-05 11:29:57.647 3.120779e+06 secs
> 190 2010-08-05 11:29:57.943 3.120778e+06 secs
> 230 2010-08-05 11:29:58.047 3.120778e+06 secs
> 211 2010-08-05 11:29:58.917 3.120777e+06 secs
> 251 2010-08-05 11:29:59.077 3.120777e+06 secs
> 298 2010-08-05 11:29:59.297 3.120777e+06 secs
> 320 2010-08-05 11:29:59.397 3.120777e+06 secs
> 366 2010-08-05 11:29:59.707 3.120777e+06 secs
> 342 2010-08-05 11:30:00.987 3.120775e+06 secs
> 380 2010-08-05 11:30:01.200 3.120775e+06 secs
> 120 2010-08-19 09:31:47.207 2.358866e+06 secs
> 141 2010-08-19 09:31:47.500 2.358866e+06 secs
> 842 2010-09-03 13:58:21.463 3.641194e+06 secs
>
>    I would like to be able to take the second column, the "processTime",
> and put it into a time series using the first column as the key to say when
> it occurred.  But everything I could find, such as ts(), went on the
> assumption that I had fully univariate data to start with, and all I needed
> to do was set the frequency & start date (in the case of ts() ).
>    I can adjust the "queued_time" arbitrarily as needed, so that if, for
> instance, the data set would end up far too sparse & empty by keeping the
> current precision, I could cut the "queued_time" precision down to just the
> year, month, day, hour.  But in that case, how would the time series handle
> the fact that there are several (varying) entries stored under the same
> time value?
>
>    The reason I want to do this is that I next want to be able to use
> all the very nice modeling capabilities that a univariate time series
> allows, such as arima, etc.
>

		Information on package 'its'

Description:

Package:            its
Version:            1.1.8
Date:               2009-09-06
Title:              Irregular Time Series
Author:             Portfolio & Risk Advisory Group, Commerzbank
                     Securities
Maintainer:         Whit Armstrong <armstrong.whit at gmail.com>
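
For the reduced-precision route raised in the question, a minimal base-R
sketch (not using 'its') might look like the following. It assumes that
entries falling on the same day are collapsed with mean(), which is only one
possible choice, and it uses just a subset of the rows quoted above. The
names d, daily, grid, full and x are illustrative.

```r
# Subset of the rows from the question (processTime in seconds)
d <- data.frame(
  queued_time = as.POSIXct(c(
    "2010-06-15 21:50:42.443", "2010-06-15 21:51:57.347",
    "2010-06-29 14:53:26.073", "2010-07-22 10:14:57.503",
    "2010-08-05 11:29:56.713", "2010-08-05 11:29:57.443",
    "2010-08-19 09:31:47.207", "2010-09-03 13:58:21.463"), tz = "UTC"),
  processTime = c(6.399989e-02, 6.300020e-02, 3.011863e+06, 4.334879e+06,
                  6.155674e+06, 3.120779e+06, 2.358866e+06, 3.641194e+06)
)

# Reduce the timestamp precision to the calendar day;
# several rows may now share the same key
d$day <- as.Date(d$queued_time, tz = "UTC")

# Collapse rows sharing a day into one value (mean is an assumption)
daily <- aggregate(processTime ~ day, data = d, FUN = mean)

# Build a complete daily grid; days without observations become NA
grid <- data.frame(day = seq(min(daily$day), max(daily$day), by = "day"))
full <- merge(grid, daily, all.x = TRUE)

# Regularly spaced series with one slot per day, as ts() expects
x <- ts(full$processTime, start = 1, frequency = 1)
```

Most of the resulting series is NA because the gaps span weeks, and whether
functions like arima() cope well with that is a separate question. An
irregular time series class such as the one in 'its' keeps the original
timestamps instead, so it should need neither the aggregation nor the NA
padding.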

-- 
David.
> Thanks in advance!
> Mike

David Winsemius, MD
West Hartford, CT
