[R-SIG-Finance] Speed optimization on minutes distribution calculation
Jeff Ryan
jeff.a.ryan at gmail.com
Tue Jun 16 05:45:17 CEST 2009
Here is an actual example using IBrokers with data from IB:
library(IBrokers)
tws <- twsConnect()   ## assumes TWS or IB Gateway is already running
NQ <- reqHistoricalData(tws,
twsFUT("NQ","GLOBEX","200909"),
useRTH="0", bar="1 min", dur="5 D")
str(NQ)
An 'xts' object from 2009-06-09 15:30:00 to 2009-06-15 22:33:00 containing:
Data: num [1:5910, 1:8] 1510 1510 1510 1510 1510 1510 1510 1510 1510 1510 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:8] "NQU9.Open" "NQU9.High" "NQU9.Low" "NQU9.Close" ...
Indexed by objects of class: [POSIXt,POSIXct] TZ: America/Chicago
xts Attributes:
List of 4
$ from : chr "20090611 04:33:46"
$ to : chr "20090616 04:33:46"
$ src : chr "IB"
$ updated: POSIXct[1:1], format: "2009-06-15 22:33:46.46141"
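library(quantmod)   ## for Vo(), which extracts the volume column
nqi <- index(NQ)
lt <- as.POSIXlt(nqi)   ## convert once and reuse the fields
hm <- lt$min + lt$hour*100   ## time of day as an HHMM number, e.g. 1530
NQV <- aggregate(Vo(NQ), as.factor(hm), sum)   ## total volume per minute of day
barplot(NQV)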
The axis/chart leaves a lot to be desired, but once again that should
be enough to set you on the right path.
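One possible way to get a more readable axis, and the per-minute mean that the original question asked for rather than the sum, is sketched below. It is only a sketch under assumptions: the index and volume are synthetic stand-ins for the IB data, and the names idx, vol, avg and labels are made up for illustration. It uses base R only (tapply, sprintf, barplot, axis) and labels whole hours so the axis stays legible.

```r
## Synthetic stand-in for the IB data (assumption): 3 days of 1-minute bars,
## with volume 100 on day 1, 200 on day 2 and 300 on day 3.
idx <- seq(as.POSIXct("2009-06-09 00:00:00", tz = "UTC"),
           by = "1 min", length.out = 3 * 24 * 60)
vol <- rep(c(100, 200, 300), each = 24 * 60)

lt <- as.POSIXlt(idx)            ## convert once and reuse the fields
hm <- lt$hour * 100L + lt$min    ## time of day as an HHMM integer, e.g. 1530

## Mean volume for each minute of the day, grouped without string matching
avg <- tapply(vol, hm, mean)

## Turn the numeric HHMM keys back into "HH:MM" labels
keys   <- as.integer(names(avg))
labels <- sprintf("%02d:%02d", keys %/% 100L, keys %% 100L)

## Draw the bars without the default names, then label whole hours only
bp   <- barplot(avg, axisnames = FALSE, border = NA, space = 0,
                ylab = "mean volume per minute")
show <- keys %% 100L == 0L
axis(1, at = bp[show], labels = labels[show], las = 2, cex.axis = 0.7)
```

With real data, vol would be the volume column and idx its index; swapping mean for sum gives the totals used above.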
HTH
Jeff
On Mon, Jun 15, 2009 at 10:24 PM, Jeff Ryan<jeff.a.ryan at gmail.com> wrote:
> I think you want something like ?aggregate.zoo
>
> I didn't pull actual volume data, but here is an example that will
> show what you can do:
>
> library(xts)  ## only used for the sequence and to leverage aggregate.zoo internally
>
> ## generate a sequence of POSIXct 1 mo @ 1min
> x <- timeBasedSeq('20090515/20090615 12:00')
>
> ## convert to POSIXlt and turn into HHMM numeric format
> hm <- as.POSIXlt(x)$min + as.POSIXlt(x)$hour * 100
>
> ## your original "Volume" column (here a simple xts object with each min having Vol=1000)
> ## There are 32 observations at each minute in 00:00--12:00 and 31 for 12:01--23:59
> xx <- xts(rep(1000,length(x)), x)
>
> ## using 'aggregate' to apply sum to the matching times
> ax <- aggregate(xx, as.factor(hm), sum)
>
> head(ax)
>
> 0 32000
> 1 32000
> 2 32000
> 3 32000
> 4 32000
> 5 32000
> tail(ax)
>
> 2354 31000
> 2355 31000
> 2356 31000
> 2357 31000
> 2358 31000
> 2359 31000
>
> I haven't had a chance to actually test this, but at the very least it
> should provide a start for you.
>
> And the above is very fast:
>
> system.time(ax <- aggregate(xx, as.factor(hm), sum))
> user system elapsed
> 0.058 0.015 0.073
>
> HTH
> Jeff
> On Mon, Jun 15, 2009 at 9:27 PM, Wind<windspeedo99 at gmail.com> wrote:
>> The periodicity() function in xts is a good tool for axis manipulation.
>>
>> Maybe I should not use character string methods to compile the
>> distribution of per-minute volume, as Brian suggested. But what
>> function should be used for such a task in R? I've tried it in kdb+, where
>> it is simple and quick enough with the select and xbar functions.
>> But I am not familiar with R. Maybe there are some functions for this
>> specific task that I don't know.
>>
>> Thanks Brian.
>>
>>
>> On Tue, Jun 16, 2009 at 8:00 AM, Brian G. Peterson<brian at braverock.com> wrote:
>>> It seems that the slow part is all the character string manipulation. This
>>> would be slow in almost any programming language. Honestly, I am always
>>> annoyed by useless axes in charts that simply count from 1 to n. A time
>>> axis at least has some real meaning, and avoids the useless rewriting of
>>> character strings.
>>>
>>> You should be able to get a meaningful, readable axis using the
>>> periodicity() function in xts without the string manipulation.
>>>
>>> Regards,
>>>
>>> - Brian
>>>
>>> Wind wrote:
>>>>
>>>> I want to plot the distribution of volume of the future CLN9 along
>>>> a 24-hour axis. The following code completes the task, but the step
>>>> sapply(mins,function(x)
>>>> {mean(hqm[which(format(index(hqm),"%H:%M")==x),5])})
>>>> is very time consuming.
>>>> Any suggestions for code with better performance would be highly
>>>> appreciated.
>>>>
>>>>
>>>> The data hqm has been retrieved from IB via IBrokers.
>>>>
>>>>
>>>>>
>>>>> head(hqm[,5])
>>>>>
>>>>
>>>> CLN9.Volume
>>>> 2009-05-25 06:00:00 17
>>>> 2009-05-25 06:01:00 2
>>>> 2009-05-25 06:02:00 11
>>>> 2009-05-25 06:03:00 26
>>>> 2009-05-25 06:04:00 20
>>>> 2009-05-25 06:05:00 5
>>>>
>>>>>
>>>>> tail(hqm[,5])
>>>>>
>>>>
>>>> CLN9.Volume
>>>> 2009-06-15 21:51:00 1050
>>>> 2009-06-15 21:52:00 807
>>>> 2009-06-15 21:53:00 782
>>>> 2009-06-15 21:54:00 385
>>>> 2009-06-15 21:55:00 562
>>>> 2009-06-15 21:56:00 423
>>>>
>>>>>
>>>>>
>>>>> mins<-unlist(lapply(0:23,function(h){sapply(0:59,function(m){paste(sprintf("%02d",h),sprintf("%02d",m),sep=":")})}))
>>>>> head(mins)
>>>>>
>>>>
>>>> [1] "00:00" "00:01" "00:02" "00:03" "00:04" "00:05"
>>>>
>>>>>
>>>>> tail(mins)
>>>>>
>>>>
>>>> [1] "23:54" "23:55" "23:56" "23:57" "23:58" "23:59"
>>>>
>>>>
>>>>>
>>>>> temp<-sapply(mins,function(x)
>>>>> {mean(hqm[which(format(index(hqm),"%H:%M")==x),5])})
>>>>> head(temp)
>>>>>
>>>>
>>>> 00:00 00:01 00:02 00:03 00:04 00:05
>>>> 279.1333 284.9333 247.8667 176.3333 278.8667 179.0667
>>>>
>>>>>
>>>>> tail(temp)
>>>>>
>>>>
>>>> 23:54 23:55 23:56 23:57 23:58 23:59
>>>> 250.2667 312.7333 318.9333 210.8000 258.2000 232.8667
>>>>
>>>>>
>>>>> plot(temp)
>>>>>
>>>>
>>>> _______________________________________________
>>>> R-SIG-Finance at stat.math.ethz.ch mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-finance
>>>> -- Subscriber-posting only.
>>>> -- If you want to post, subscribe first.
>>>>
>>>
>>>
>>> --
>>> Brian G. Peterson
>>> http://braverock.com/brian/
>>> Ph: 773-459-4973
>>> IM: bgpbraverock
>>>
>>>
>>>
>>
>>
>
>
>
> --
> Jeffrey Ryan
> jeffrey.ryan at insightalgo.com
>
> ia: insight algorithmics
> www.insightalgo.com
>