[R] Questions about doing analysis based on time
R. Michael Weylandt
michael.weylandt at gmail.com
Fri Jun 22 21:25:30 CEST 2012
On Fri, Jun 22, 2012 at 2:18 PM, John Kane <jrkrideau at inbox.com> wrote:
> Hi and welcome to the R-help list.
>
> It would be much better for readers to get your data in a more easily used format.
>
> There is a function called dput() that will output your data in a way that R can read easily.
>
> We don't need to see all the data, but perhaps a hundred lines of it would be nice.
>
> Try this, where your data frame is called "mydata":
> # just copy the line below and paste into R
> head(mydata, 100)
I think you mean dput(head(mydata, 100))
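As a rough illustration of why that matters (mydata below is a made-up
stand-in, not the attached file; only the column names follow the post),
dput() prints R code that rebuilds the object, so a reader can paste it
straight into a session:

```r
# Hypothetical stand-in for the real data; only the column names follow the post
mydata <- data.frame(
  SunDate  = c("5/9/2010 0:00", "6/12/2011 0:00", "6/15/2008 0:04"),
  SunTime  = c("0:00", "0:00", "0:04"),
  SunScore = c(127L, 125L, 98L),
  stringsAsFactors = FALSE
)

# dput() prints R code that rebuilds the object; head() keeps the output short
dput(head(mydata, 100))

# Round trip: deparse() gives the same kind of text as a character vector,
# and evaluating that text recreates the data frame
copy <- eval(parse(text = deparse(head(mydata, 100))))
all.equal(copy, head(mydata, 100))
```

Plain head() output, by contrast, loses the column classes (factor versus
character, for example), which is exactly what matters for this question.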
OP: Once you post this I'll give a fuller reply, but for now I'd suggest
you put your data into a proper time series class (zoo/xts, if I
might give a personal-ish plug), which will make all these calculations
much easier.
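To give a flavour of what that looks like, here is a minimal sketch with
zoo. The timestamps and scores are made up (I don't have the real data),
and the names stamp, score, z and hourly_mean are just placeholders:

```r
library(zoo)  # install.packages("zoo") if it is not already available

# Made-up readings standing in for the real data (values are illustrative only)
stamp <- as.POSIXct(c("2011-01-01 01:00", "2011-01-01 01:40", "2011-01-01 03:36",
                      "2011-01-02 01:15", "2011-01-02 03:05"),
                    format = "%Y-%m-%d %H:%M", tz = "UTC")
score <- c(101, 142, 156, 130, 146)

# A zoo object keeps the values and their time index together, sorted by time
z <- zoo(score, order.by = stamp)

# Mean score per clock hour, pooling all days: group on the hour of the index
hourly_mean <- aggregate(z, by = function(t) as.integer(format(t, "%H")), FUN = mean)
print(hourly_mean)
```

The point of the class is that date and time live in one index, so grouping,
subsetting and merging are done on that index rather than on separate
SunDate and SunTime columns.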
Best,
Michael
>
> # Now copy the output and paste it into your word processor as a reply to the list, and we will have decent data to work with.
>
> John Kane
> Kingston ON Canada
>
>
>> -----Original Message-----
>> From: mikeedinger16 at gmail.com
>> Sent: Fri, 22 Jun 2012 09:21:40 -0700 (PDT)
>> To: r-help at r-project.org
>> Subject: [R] Questions about doing analysis based on time
>>
>>
>> I have a spreadsheet that I've read into R using read.csv. I've also
>> attached it. It looks like this (except there are 1600+ entries):
>>
>>> Sunday
>>            SunDate SunTime SunScore
>> 1    5/9/2010 0:00    0:00      127
>> 2   6/12/2011 0:00    0:00      125
>> 3   6/15/2008 0:04    0:04       98
>> 4    8/3/2008 0:07    0:07      118
>> 5   7/24/2011 0:07    0:07      122
>> 6   5/25/2008 0:09    0:09      104
>> 7   5/20/2012 0:11    0:11      124
>> 8  10/18/2009 0:12    0:12      121
>> 9   3/14/2010 0:12    0:12      117
>> 10   1/2/2011 0:12    0:12      131
>>
>> SunDate and SunTime are both factors. In order to change the class to
>> something I can work with, I use the following:
>>
>> Sunday$SunTime <- as.POSIXlt(SunTime, tz = "", "%H:%M")
>> Sunday$SunDate <- as.POSIXlt(SunDate, tz = "", "%m/%d/%Y %H:%M")
>>
>> Now, the str(Sunday) command yields:
>>
>> 'data.frame': 1644 obs. of 3 variables:
>>  $ SunDate : POSIXlt, format: "2010-05-09 00:00:00" "2011-06-12 00:00:00" ...
>>  $ SunTime : POSIXlt, format: "2012-06-18 00:00:00" "2012-06-18 00:00:00" ...
>> $ SunScore: int 127 125 98 118 122 104 124 121 117 131 ...
>>
>> I think all the elements in Sunday are correct for me to do what I want
>> to do, but I don't know how to do it.
>>
>> 1. How can I get the mean score by hour? For example, I want the mean
>> score of all the entries between 0:00 and 0:59, then 1:00 and 1:59, etc.
>> 2. Is it possible for me to create a histogram by hour for each score
>> over a certain point? For example, I want to make a histogram of all
>> scores above 140 by the hour they occurred in. Is that possible?
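Both of these come down to extracting the hour from the parsed time. A
base-R sketch, using made-up rows in the same layout (the values, the
extra hour column and the cutoff variable are my own placeholders, not
the real data):

```r
# Made-up readings in the layout shown in the post (values are illustrative only)
Sunday <- data.frame(
  SunTime  = c("0:00", "0:12", "1:30", "1:45", "13:05", "13:50", "23:10"),
  SunScore = c(127, 131, 150, 142, 98, 145, 160),
  stringsAsFactors = FALSE
)

# Parse the clock time and pull out the hour of day (0 to 23)
tm <- as.POSIXlt(Sunday$SunTime, format = "%H:%M", tz = "UTC")
Sunday$hour <- tm$hour

# 1. Mean score for each hour that has data
mean_by_hour <- tapply(Sunday$SunScore, Sunday$hour, mean)
print(mean_by_hour)

# 2. Histogram of the hours in which scores above a cutoff occurred,
#    with one bin per hour of the day
cutoff <- 140
high_hours <- Sunday$hour[Sunday$SunScore > cutoff]
h <- hist(high_hours, breaks = seq(-0.5, 23.5, by = 1),
          main = "Scores above 140 by hour", xlab = "Hour of day")
```

If SunTime is still a factor, as.POSIXlt() converts it via as.character(),
so the same call should work on the data as read by read.csv.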
>>
>> These last few might not be possible (at least with R), but I'll ask
>> anyway. I've got another data set similar to the one above, except it's
>> got 12,000 entries over four years. If I use the same commands as above
>> to turn Date and Time into POSIXlt, is it possible for me to do the
>> following:
>>
>> 1. The data was recorded at irregular intervals, and the gap between
>> recorded points can range from 1 hour up to 7. Is it possible, when data
>> isn't recorded between two points, to insert the hours that are
>> unrecorded along with the average for that hour? This is sort of a
>> prerequisite for the next two.
>> 2. If one of the entries has a Score above a certain point, is it
>> possible to determine how long it was above that point, and determine the
>> mean for all the instances where this occurred? For example:
>> 01/01/11 01:00 AM   101
>> 01/01/11 02:21 AM   142
>> 01/01/11 03:36 AM   156
>> 01/01/11 04:19 AM   130
>> 01/01/11 05:12 AM   146
>> 01/01/11 06:49 AM   116
>> 01/01/11 07:09 AM   111
>> There are two spans where it's above 140: the two and three o'clock
>> hours, and the 5 o'clock hour. So the mean time would be 1.5 hours. Is it
>> possible for R to do this over a much larger time period?
>>
>> 3. If a score reaches a certain point, is it possible for R to determine
>> the average time between that and when the score reaches another point?
>> For example:
>> 01/01/11 01:01 AM   101
>> 01/01/11 02:21 AM   121
>> 01/01/11 03:14 AM   134
>> 01/01/11 04:11 AM   149
>> 01/01/11 05:05 AM   119
>> 01/01/11 06:14 AM   121
>> 01/01/11 07:19 AM   127
>> 01/01/11 08:45 AM   134
>> 01/01/11 09:11 AM   142
>> 01/01/11 10:10 AM   131
>> The score goes above 120 during the 2 AM hour and doesn't go above 140
>> until the 4 AM hour. Then it goes above 120 again in the 6 AM hour, but
>> doesn't go above 140 until the 9 AM hour. So the average time to go from
>> 120 to 140 is 2.5 hours. Can R do this over a much larger time frame?
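For the last two questions, a sketch in base R using the two examples
above. It counts whole clock hours, as the examples do, and assumes there
is exactly one reading per hour (i.e. any gaps have already been filled).
The function name time_between_levels and its arguments are hypothetical,
just for illustration:

```r
# Question 2: how long does the score stay above a level?
score2 <- c(101, 142, 156, 130, 146, 116, 111)   # readings for hours 1 to 7
runs   <- rle(score2 > 140)                      # consecutive TRUE/FALSE runs
spans  <- runs$lengths[runs$values]              # lengths of the "above 140" runs
mean(spans)                                      # 1.5 hours for this example

# Question 3: hours from rising above `lower` to first rising above `upper`.
# A new measurement only starts after the score has dropped back to `lower`
# or below (an assumption about what "reaches a certain point" means).
time_between_levels <- function(hour, score, lower, upper) {
  waits <- numeric(0)
  state <- "below"                               # "below", "timing" or "done"
  start <- NA
  for (i in seq_along(score)) {
    if (score[i] <= lower) {
      state <- "below"                           # reset once it falls back
    } else if (state == "below") {
      state <- "timing"                          # first reading above `lower`
      start <- hour[i]
    }
    if (state == "timing" && score[i] > upper) {
      waits <- c(waits, hour[i] - start)         # reached `upper`: record the wait
      state <- "done"
    }
  }
  waits
}

hour3  <- 1:10
score3 <- c(101, 121, 134, 149, 119, 121, 127, 134, 142, 131)
waits  <- time_between_levels(hour3, score3, lower = 120, upper = 140)
mean(waits)                                      # 2.5 hours for this example
```

With the real timestamps instead of clock hours you would pass a numeric
time (for example hours since the first reading) as the hour argument, and
the answers would then be elapsed time rather than counts of clock hours.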
>>
>> If anyone knows how to easily do any of these (particularly the first
>> part), I'd greatly appreciate it.
>>
>> If some of these are possible, but aren't simple commands and require
>> more in-depth programming knowledge and time commitment, can someone at
>> least tell me what sort of thing to look up?
>>
>> --
>> View this message in context:
>> http://r.789695.n4.nabble.com/Questions-about-doing-analysis-based-on-time-tp4634230.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.