[R] Importing Time Series Data for an R Beginner

Cedrick W. Johnson (CJ) cedrick at cedrickjohnson.com
Thu Mar 11 21:34:37 CET 2010


Hi Clay-

You may want to look at the 'xts' package, along with 'strptime' 
and 'as.POSIXct'.

When I get datasets in Excel, what I normally do is change the date 
(column) format to YYYY-mm-dd before exporting. But that's due to my own 
shortcomings with date formatting in R.
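
As a rough sketch, the m/d/yy dates from the original file can probably be parsed directly, without reformatting in Excel first. The sample values below are copied from the question; the variable names and the tz = "UTC" setting are assumptions for illustration only.

```r
# Sample values in the m/d/yy layout shown in the original question
raw_date <- c("7/23/03", "9/25/04", "3/02/04")
raw_time <- c("13:05:00", "14:34:00", "16:34:00")

# %m/%d/%y reads month/day/two-digit year (00-68 map to 20xx).
# tz = "UTC" is an assumption; use the zone the data was recorded in.
stamps <- as.POSIXct(paste(raw_date, raw_time),
                     format = "%m/%d/%y %H:%M:%S", tz = "UTC")

# Any NA here means a row did not match the format string
print(stamps)
print(sum(is.na(stamps)))
```

Checking for NA after parsing is worthwhile, since a format mismatch yields NA rather than an error.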

Here's a quick example:

 > library(xts)
 > x = read.csv('TestData.csv')
 > x
   Subject       Date     Time Value
1       1 2003-07-23 13:05:00    84
2       1 2003-07-23 13:10:00    87
3       1 2003-07-23 13:15:00    95
4       2 2004-09-25 14:34:00    95
5       2 2004-09-25 14:39:00    81
6       2 2004-09-25 14:44:00    93
7       3 2004-03-02 16:34:00    72
8       3 2004-03-02 16:39:00    67
9       3 2004-03-02 16:44:00    83

 > dates = as.POSIXct(strptime(paste(x[,2], x[,3], sep=" "), format="%Y-%m-%d %H:%M:%S"))


 > dates
[1] "2003-07-23 13:05:00 EDT" "2003-07-23 13:10:00 EDT" "2003-07-23 13:15:00 EDT"
[4] "2004-09-25 14:34:00 EDT" "2004-09-25 14:39:00 EDT" "2004-09-25 14:44:00 EDT"
[7] "2004-03-02 16:34:00 EST" "2004-03-02 16:39:00 EST" "2004-03-02 16:44:00 EST"

 > data = xts(x[,c(1,4)], order.by=dates)
 > data
                     Subject Value
2003-07-23 13:05:00       1    84
2003-07-23 13:10:00       1    87
2003-07-23 13:15:00       1    95
2004-03-02 16:34:00       3    72
2004-03-02 16:39:00       3    67
2004-03-02 16:44:00       3    83
2004-09-25 14:34:00       2    95
2004-09-25 14:39:00       2    81
2004-09-25 14:44:00       2    93
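
To get from here to per-trace summaries, one possible approach, sketched in base R, is to start a new trace whenever the subject or the date changes, or the gap to the previous row exceeds 5 minutes. The data frame rebuilds the example above; the column names stamp and Trace, the result names, and tz = "UTC" are assumptions, not part of the original data.

```r
# Rebuild the example data shown above
x <- data.frame(
  Subject = rep(c(1, 2, 3), each = 3),
  Date    = rep(c("2003-07-23", "2004-09-25", "2004-03-02"), each = 3),
  Time    = c("13:05:00", "13:10:00", "13:15:00",
              "14:34:00", "14:39:00", "14:44:00",
              "16:34:00", "16:39:00", "16:44:00"),
  Value   = c(84, 87, 95, 95, 81, 93, 72, 67, 83),
  stringsAsFactors = FALSE
)

# tz = "UTC" is an assumption made to keep the example reproducible
x$stamp <- as.POSIXct(paste(x$Date, x$Time),
                      format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
x <- x[order(x$Subject, x$stamp), ]

# A row opens a new trace if the subject changes, the date changes,
# or more than 300 seconds have passed since the previous row
n <- nrow(x)
new_trace <- c(TRUE,
               diff(x$Subject) != 0 |
               x$Date[-1] != x$Date[-n] |
               diff(as.numeric(x$stamp)) > 300)
x$Trace <- cumsum(new_trace)

# Summary statistics per trace, then per subject
trace_means   <- aggregate(Value ~ Subject + Trace, data = x, FUN = mean)
subject_means <- aggregate(Value ~ Subject, data = x, FUN = mean)
print(trace_means)
print(subject_means)
```

From there, split(x, x$Trace) returns one data frame per trace, which can be looped over to plot each trace on its own.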


HTH

-cedrick

=============================
Cedrick Johnson
aolim) cedrickjcvgr
www.cedrickjohnson.com
New York - Chicago


On 3/11/2010 3:13 PM, Clay Heaton wrote:
> Hi, I'm trying to learn R for a project I'm working on. I know several programming languages, so I'm comfortable with the syntax. What I can't figure out is how to import the file of time series data that I have and parse it into individual series.  The data was given to me in Excel, but I can output it to tab-delimited or csv. I've been able to pull in the entire table with read.table(), but I can't figure out how to parse it into distinct groups.
>
> It looks like this:
>
> Subject     Date    Time   Value
> 1     7/23/03  13:05:00   84
> 1     7/23/03  13:10:00   87
> 1     7/23/03  13:15:00   95
> ....
> 1     9/25/04  14:34:00   95
> 1     9/25/04  14:39:00   81
> 1     9/25/04  14:44:00   93
> ...
> 2     3/02/04  16:34:00   72
> 2     3/02/04  16:39:00   67
> 2     3/02/04  16:44:00   83
> ...
> 2     3/21/05  11:15:00   121
> 2     3/21/05  11:20:00   125
> 2     3/21/05  11:25:00   120
> ...
>
> There are ~ 100,000 rows of data. There are 86 subjects, each with one or more traces. For each trace, the times are in uniform increments of 5 minutes. Some traces include up to 500 values and others only 40.
>
> For now, what I'm looking to do is to be able to generate summary statistics for each trace, and then for each subject. Hence, I need a way to aggregate by value or subject, where the criteria for aggregating traces are that the values were collected on the same day and all are within 5 minutes of each other. I would like to be able to iterate through the data to plot each trace independently.
>
> Any suggestions to help me get started would be appreciated. I'm looking to learn, so I'd appreciate pointers to good tutorials or code examples of dealing with time series data.
>
> Thanks!
> Clay
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


