[R] Importing Time Series Data for an R Beginner
Cedrick W. Johnson
cedrick at cedrickjohnson.com
Thu Mar 11 21:43:49 CET 2010
Actually, I just learned something myself: you can handle this dataset
*without* the additional step in Excel. I changed the format string in
strptime to match the date format in the file (d'oh!!!!) and voilà:
> x
  Subject      Date     Time Value
1       1 7/23/2003 13:05:00    84
2       1 7/23/2003 13:10:00    87
3       1 7/23/2003 13:15:00    95
4       2 9/25/2004 14:34:00    95
5       2 9/25/2004 14:39:00    81
6       2 9/25/2004 14:44:00    93
7       3  3/2/2004 16:34:00    72
8       3  3/2/2004 16:39:00    67
9       3  3/2/2004 16:44:00    83
> dates = as.POSIXct(strptime(paste(x[,2], x[,3], sep=" "),
format="%m/%d/%Y %H:%M:%S"))
> dates
[1] "2003-07-23 13:05:00 EDT" "2003-07-23 13:10:00 EDT" "2003-07-23 13:15:00 EDT"
[4] "2004-09-25 14:34:00 EDT" "2004-09-25 14:39:00 EDT" "2004-09-25 14:44:00 EDT"
[7] "2004-03-02 16:34:00 EST" "2004-03-02 16:39:00 EST" "2004-03-02 16:44:00 EST"
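A side note on the format string: the file in Clay's original message uses two-digit years (7/23/03), and the matching conversion for those is %y rather than %Y. Here is a minimal sketch of that case. The vector name raw and the tz = "UTC" argument are assumptions made for illustration; they are not part of the session above.

```r
# Two-digit years, as in the original spreadsheet export ("7/23/03")
raw <- c("7/23/03 13:05:00", "9/25/04 14:34:00", "3/02/04 16:34:00")

# %y reads a two-digit year; R maps 00-68 to 20xx and 69-99 to 19xx.
# tz = "UTC" is an assumption so the result does not depend on the local time zone.
stamps <- as.POSIXct(strptime(raw, format = "%m/%d/%y %H:%M:%S", tz = "UTC"))
print(format(stamps, "%Y-%m-%d %H:%M:%S"))

# A format string that does not match the text yields NA rather than an error,
# so it is worth checking for NA right after parsing.
bad <- strptime(raw, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
print(is.na(bad))
```

If every parsed value comes back NA, the format string is the first thing to check.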
> library(xts)  # xts() comes from the xts package
> data = xts(x[,c(1,4)], order.by=dates)
> data
                    Subject Value
2003-07-23 13:05:00       1    84
2003-07-23 13:10:00       1    87
2003-07-23 13:15:00       1    95
2004-03-02 16:34:00       3    72
2004-03-02 16:39:00       3    67
2004-03-02 16:44:00       3    83
2004-09-25 14:34:00       2    95
2004-09-25 14:39:00       2    81
2004-09-25 14:44:00       2    93
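For the per-trace and per-subject summaries Clay asked about, here is a rough sketch in base R (no xts needed). It rebuilds the nine example rows by hand so it runs on its own. The column names stamp and trace, the helper vectors, the use of mean as the summary, and tz = "UTC" are all assumptions for illustration. The trace rule is taken from the original question: same subject, same day, readings no more than 5 minutes apart.

```r
# Stand-in for the CSV shown above (same columns and values, typed in by hand)
x <- data.frame(
  Subject = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  Date    = c(rep("7/23/2003", 3), rep("9/25/2004", 3), rep("3/2/2004", 3)),
  Time    = c("13:05:00", "13:10:00", "13:15:00",
              "14:34:00", "14:39:00", "14:44:00",
              "16:34:00", "16:39:00", "16:44:00"),
  Value   = c(84, 87, 95, 95, 81, 93, 72, 67, 83),
  stringsAsFactors = FALSE
)

# tz = "UTC" is an assumption; it keeps daylight-saving shifts out of the gaps
x$stamp <- as.POSIXct(paste(x$Date, x$Time),
                      format = "%m/%d/%Y %H:%M:%S", tz = "UTC")

# Put each subject's readings in time order
x <- x[order(x$Subject, x$stamp), ]

# Compare every row with the one before it (the first row always starts a trace)
n           <- nrow(x)
day         <- format(x$stamp, "%Y-%m-%d")
gap_minutes <- c(Inf, diff(as.numeric(x$stamp)) / 60)
new_subject <- c(TRUE, x$Subject[-1] != x$Subject[-n])
new_day     <- c(TRUE, day[-1] != day[-n])

# A new trace starts when the subject changes, the day changes,
# or the gap to the previous reading is more than 5 minutes
x$trace <- cumsum(new_subject | new_day | gap_minutes > 5)

# Summary statistic (mean here) per trace, then per subject
per_trace   <- aggregate(Value ~ Subject + trace, data = x, FUN = mean)
per_subject <- aggregate(Value ~ Subject, data = x, FUN = mean)
print(per_trace)
print(per_subject)
```

From there, split(x, x$trace) gives a list with one data frame per trace, which can be looped over to plot each trace separately, e.g. with plot(tr$stamp, tr$Value, type = "l") inside the loop.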
hth,
c
ps: my first message didn't make it to the list... apparently i had a
bad header??
=============================
Cedrick W. Johnson
aolim) cedrickjcvgr
www.cedrickjohnson.com
New York - Chicago
On 3/11/2010 3:34 PM, Cedrick W. Johnson (CJ) wrote:
> Hi Clay-
>
> You may want to look at the xts package, along with 'strptime'
> and 'as.POSIXct'.
>
> When I get datasets in Excel, what I normally do is change the date
> (column) format to YYYY-mm-dd.. But that's due to my own shortcomings
> with date formatting in R.
>
> Here's a quick example:
>
> > x = read.csv('TestData.csv')
> > x
>   Subject       Date     Time Value
> 1       1 2003-07-23 13:05:00    84
> 2       1 2003-07-23 13:10:00    87
> 3       1 2003-07-23 13:15:00    95
> 4       2 2004-09-25 14:34:00    95
> 5       2 2004-09-25 14:39:00    81
> 6       2 2004-09-25 14:44:00    93
> 7       3 2004-03-02 16:34:00    72
> 8       3 2004-03-02 16:39:00    67
> 9       3 2004-03-02 16:44:00    83
>
> dates = as.POSIXct(strptime(paste(x[,2], x[,3], sep=" "),
> format="%Y-%m-%d %H:%M:%S"))
>
>
> > dates
> [1] "2003-07-23 13:05:00 EDT" "2003-07-23 13:10:00 EDT" "2003-07-23 13:15:00 EDT"
> [4] "2004-09-25 14:34:00 EDT" "2004-09-25 14:39:00 EDT" "2004-09-25 14:44:00 EDT"
> [7] "2004-03-02 16:34:00 EST" "2004-03-02 16:39:00 EST" "2004-03-02 16:44:00 EST"
>
> > data = xts(x[,c(1,4)], order.by=dates)
> > data
>                     Subject Value
> 2003-07-23 13:05:00       1    84
> 2003-07-23 13:10:00       1    87
> 2003-07-23 13:15:00       1    95
> 2004-03-02 16:34:00       3    72
> 2004-03-02 16:39:00       3    67
> 2004-03-02 16:44:00       3    83
> 2004-09-25 14:34:00       2    95
> 2004-09-25 14:39:00       2    81
> 2004-09-25 14:44:00       2    93
>
>
> HTH
>
> -cedrick
>
>
>
> On 3/11/2010 3:13 PM, Clay Heaton wrote:
>> Hi, I'm trying to learn R for a project I'm working on. I know several
>> programming languages, so I'm comfortable with the syntax. What I
>> can't figure out is how to import the file of time series data that I
>> have and parse it into individual series. The data was given to me in
>> Excel, but I can output it to tab-delimited or csv. I've been able to
>> pull in the entire table with read.table(), but I can't figure out how
>> to parse it into distinct groups.
>>
>> It looks like this:
>>
>> Subject Date Time Value
>> 1 7/23/03 13:05:00 84
>> 1 7/23/03 13:10:00 87
>> 1 7/23/03 13:15:00 95
>> ....
>> 1 9/25/04 14:34:00 95
>> 1 9/25/04 14:39:00 81
>> 1 9/25/04 14:44:00 93
>> ...
>> 2 3/02/04 16:34:00 72
>> 2 3/02/04 16:39:00 67
>> 2 3/02/04 16:44:00 83
>> ...
>> 2 3/21/05 11:15:00 121
>> 2 3/21/05 11:20:00 125
>> 2 3/21/05 11:25:00 120
>> ...
>>
>> There are ~100,000 rows of data. There are 86 subjects, and each of
>> them has one or more traces. Within each trace, the times are in
>> uniform increments of 5 minutes. Some traces include up to 500 values
>> and others only 40.
>>
>> For now, what I'm looking to do is to be able to generate summary
>> statistics for each trace, and then for each subject. Hence, I need a
>> way to aggregate by value or subject, where the criteria for
>> aggregating traces are that the values were collected on the same day
>> and all are within 5 minutes of each other. I would like to be able to
>> iterate through the data to plot each trace independently.
>>
>> Any suggestions to help me get started would be appreciated. I'm
>> looking to learn, so I'd appreciate pointers to good tutorials or code
>> examples of dealing with time series data.
>>
>> Thanks!
>> Clay