[R] Questions about doing analysis based on time

APOCooter mikeedinger16 at gmail.com
Fri Jun 22 18:21:40 CEST 2012


Man, R has a steep learning curve (but I suppose you all know this).  I have
very little programming knowledge, so when I search for answers to my
questions, I struggle with making sense of a lot of the pages.

I have a spreadsheet that I've read into R using read.csv.  I've also
attached it.  It looks like this (except there are 1600+ entries):

> Sunday
              SunDate SunTime SunScore
1       5/9/2010 0:00    0:00      127
2      6/12/2011 0:00    0:00      125
3      6/15/2008 0:04    0:04       98
4       8/3/2008 0:07    0:07      118
5      7/24/2011 0:07    0:07      122
6      5/25/2008 0:09    0:09      104
7      5/20/2012 0:11    0:11      124
8     10/18/2009 0:12    0:12      121
9      3/14/2010 0:12    0:12      117
10      1/2/2011 0:12    0:12      131

SunDate and SunTime are both factors.  In order to change the class to
something I can work with, I use the following:

Sunday$SunTime <- as.POSIXlt(Sunday$SunTime, tz = "", format = "%H:%M")
Sunday$SunDate <- as.POSIXlt(Sunday$SunDate, tz = "", format = "%m/%d/%Y %H:%M")

Now, the str(Sunday) command yields:

'data.frame':   1644 obs. of  3 variables:
 $ SunDate : POSIXlt, format: "2010-05-09 00:00:00" "2011-06-12 00:00:00" ...
 $ SunTime : POSIXlt, format: "2012-06-18 00:00:00" "2012-06-18 00:00:00" ...
 $ SunScore: int  127 125 98 118 122 104 124 121 117 131 ...

I think all the columns in Sunday are now in the right form for what I want
to do, but I don't know how to do it.

1. How can I get the mean score by hour?  For example, I want the mean score
of all the entries between 0:00 and 0:59, then 1:00  and 1:59, etc.
2. Is it possible for me to create a histogram by hour for each score over a
certain point?  For example, I want to make a histogram of all scores above
140 by the hour they occurred in.  Is that possible?
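
A minimal sketch of one way these two questions could be approached in base R.
It assumes the hour can be taken from the full SunDate timestamp, uses UTC
only to keep the example reproducible, and the names Hour, mean_by_hour, high
and high_counts are illustrative.  The first ten rows are the sample shown
above; the last three rows are made up so that more than one hour and some
scores above 140 appear.

```r
# First ten rows copied from the sample above; the last three are made up.
Sunday <- data.frame(
  SunDate = c("5/9/2010 0:00", "6/12/2011 0:00", "6/15/2008 0:04",
              "8/3/2008 0:07", "7/24/2011 0:07", "5/25/2008 0:09",
              "5/20/2012 0:11", "10/18/2009 0:12", "3/14/2010 0:12",
              "1/2/2011 0:12",
              "1/2/2011 1:30", "1/9/2011 1:45", "1/9/2011 2:10"),  # made up
  SunScore = c(127, 125, 98, 118, 122, 104, 124, 121, 117, 131,
               150, 142, 145),                                     # made up
  stringsAsFactors = FALSE
)

# Parse the timestamp once; POSIXct is easier to store in a data frame
# than POSIXlt.
when <- as.POSIXct(Sunday$SunDate, format = "%m/%d/%Y %H:%M", tz = "UTC")

# Hour of day as an integer from 0 to 23.
Sunday$Hour <- as.integer(format(when, "%H"))

# Question 1: mean score within each hour of the day.
mean_by_hour <- tapply(Sunday$SunScore, Sunday$Hour, mean)
print(mean_by_hour)

# Question 2: number of scores above 140 in each hour of the day.
high <- Sunday[Sunday$SunScore > 140, ]
high_counts <- table(factor(high$Hour, levels = 0:23))
print(high_counts)

# Draw the counts as bars when running in an interactive session.
if (interactive()) {
  barplot(high_counts, xlab = "Hour of day", ylab = "Scores above 140")
}
```

Counting with table() and drawing with barplot() is used here instead of
hist() because the hour is a whole number; hist(high$Hour, breaks = 0:24)
would be an alternative, but its default bins are closed on the right.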

These last few might not be possible (at least with R), but I'll ask anyway.
I've got another data set similar to the one above, except it's got 12,000
entries over four years.  If I do the same commands as above to turn Date
and Time into POSIXlt, is it possible for me to do the following:

1. The data was recorded at irregular intervals, and the gap between
recorded points ranges from 1 hour up to 7.  Is it possible, when no data
was recorded between two points, to insert the unrecorded hours along with
the average score for each of those hours?  This is sort of a prerequisite
for the next two.
2. If one of the entries has a Score above a certain point, is it possible
to determine how long it stayed above that point, and then the mean duration
over all the instances where this occurred?  For example:
01/01/11 01:00 AM    101
01/01/11 02:21 AM    142
01/01/11 03:36 AM    156
01/01/11 04:19 AM    130
01/01/11 05:12 AM    146
01/01/11 06:49 AM    116
01/01/11 07:09 AM    111
There are two spans where it's above 140: the two and three o'clock hours,
and the 5 o'clock hour.  So the mean time would be 1.5 hours.  Is it
possible for R to do this over a much larger time period?

3.  If a score reaches a certain point, is it possible for R to determine
the average time between that and when the score reaches another point?  For
example:
01/01/11 01:01 AM    101
01/01/11 02:21 AM    121
01/01/11 03:14 AM    134
01/01/11 04:11 AM    149
01/01/11 05:05 AM    119
01/01/11 06:14 AM    121
01/01/11 07:19 AM    127
01/01/11 08:45 AM    134
01/01/11 09:11 AM    142
01/01/11 10:10 AM    131
The score goes above 120 during the 2 AM hour and doesn't go above 140 until
the 4 AM hour.  Then it goes above 120 again in the 6 AM hour, but doesn't
go above 140 until the 9 AM hour.  So the average time to go from 120 to 140
is 2.5 hours.  Can R do this over a much larger time frame?
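
A minimal sketch of how items 2 and 3 might be computed in base R, using the
two example series above re-typed in 24-hour form.  It counts time in whole
clock hours, as the worked answers of 1.5 and 2.5 hours do, and it assumes
at most one reading per clock hour.  The helper names clock_hour,
mean_span_above and mean_time_between are hypothetical, and for item 3 the
clock is restarted each time the score rises above the lower point, which is
an assumption about what is wanted.

```r
# The two example series from items 2 and 3, re-typed in 24-hour form.
ex2 <- data.frame(
  Time  = c("2011-01-01 01:00", "2011-01-01 02:21", "2011-01-01 03:36",
            "2011-01-01 04:19", "2011-01-01 05:12", "2011-01-01 06:49",
            "2011-01-01 07:09"),
  Score = c(101, 142, 156, 130, 146, 116, 111),
  stringsAsFactors = FALSE
)
ex3 <- data.frame(
  Time  = c("2011-01-01 01:01", "2011-01-01 02:21", "2011-01-01 03:14",
            "2011-01-01 04:11", "2011-01-01 05:05", "2011-01-01 06:14",
            "2011-01-01 07:19", "2011-01-01 08:45", "2011-01-01 09:11",
            "2011-01-01 10:10"),
  Score = c(101, 121, 134, 149, 119, 121, 127, 134, 142, 131),
  stringsAsFactors = FALSE
)

# Clock hour of each reading as a count of whole hours since the epoch (UTC),
# so differences are whole hours even across midnight.
clock_hour <- function(time) {
  t <- as.POSIXct(time, format = "%Y-%m-%d %H:%M", tz = "UTC")
  floor(as.numeric(t) / 3600)
}

# Item 2: mean length, in clock hours, of the runs of readings above `limit`.
mean_span_above <- function(time, score, limit) {
  h <- clock_hour(time)
  runs <- rle(score > limit)          # consecutive TRUE/FALSE runs
  last <- cumsum(runs$lengths)        # index of the last reading in each run
  first <- last - runs$lengths + 1    # index of the first reading in each run
  spans <- (h[last] - h[first] + 1)[runs$values]  # keep the "above" runs only
  mean(spans)
}

# Item 3: mean number of clock hours from rising above `low` to the next
# reading above `high`.
mean_time_between <- function(time, score, low, high) {
  h <- clock_hour(time)
  start <- NA
  waits <- numeric(0)
  for (i in seq_along(score)) {
    rose_above_low <- score[i] > low && (i == 1 || score[i - 1] <= low)
    if (rose_above_low) start <- h[i]
    if (!is.na(start) && score[i] > high) {
      waits <- c(waits, h[i] - start)
      start <- NA
    }
  }
  mean(waits)
}

print(mean_span_above(ex2$Time, ex2$Score, 140))         # 1.5 for this example
print(mean_time_between(ex3$Time, ex3$Score, 120, 140))  # 2.5 for this example
```

On irregular data the spans are only as good as the hourly fill from item 1,
since an unrecorded hour inside a run is treated here as still above the
limit.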

If anyone knows how to easily do any of these (particularly the first part),
I'd greatly appreciate it.  
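
For the prerequisite step in the second list (inserting the unrecorded
hours), a minimal sketch of one possible reading: build every clock hour
between the first and last reading, then fill each empty hour with the mean
of the recorded scores for that hour of day.  The data below is made up, the
column and variable names are illustrative, and it assumes at most one
reading per clock hour.

```r
# Made-up irregular readings spanning two days.
obs <- data.frame(
  Time  = as.POSIXct(c("2011-01-01 01:10", "2011-01-01 03:40",
                       "2011-01-02 01:20", "2011-01-02 02:05"),
                     format = "%Y-%m-%d %H:%M", tz = "UTC"),
  Score = c(100, 130, 120, 110)
)

# Truncate each reading to the start of its clock hour.
obs$Hour <- as.POSIXct(trunc(obs$Time, units = "hours"))

# Every clock hour from the first reading to the last.
grid <- data.frame(Hour = seq(min(obs$Hour), max(obs$Hour), by = "hour"))

# Left join: hours without a reading get NA as their Score.
filled <- merge(grid, obs[, c("Hour", "Score")], by = "Hour", all.x = TRUE)

# Mean of the recorded scores for each hour of day (0 to 23).
filled$HourOfDay <- as.integer(format(filled$Hour, "%H", tz = "UTC"))
hour_means <- tapply(filled$Score, filled$HourOfDay, mean, na.rm = TRUE)

# Fill the gaps with the mean for the matching hour of day.  Hours of day
# that were never recorded have no mean and stay missing.
missing <- is.na(filled$Score)
filled$Score[missing] <- hour_means[as.character(filled$HourOfDay[missing])]

print(head(filled, 4))
```

With only four readings most hours of day have no mean and stay missing; on
four years of data each hour of day would normally have recorded scores to
average.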

If some of these are possible, but aren't simple commands and require more
in-depth programming knowledge and time commitment, can someone at least
tell me what sort of thing to look up?
