[R-SIG-Finance] Speed optimization on minutes distribution calculation

Jeff Ryan jeff.a.ryan at gmail.com
Tue Jun 16 05:45:17 CEST 2009


An actual example using IBrokers/IB

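library(IBrokers)    ## provides twsConnect, twsFUT and reqHistoricalData
tws <- twsConnect()  ## assumes a running TWS or IB Gateway session
NQ <- reqHistoricalData(tws,
          twsFUT("NQ","GLOBEX","200909"),
          useRTH="0", bar="1 min", dur="5 D")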

str(NQ)
An 'xts' object from 2009-06-09 15:30:00 to 2009-06-15 22:33:00 containing:
  Data: num [1:5910, 1:8] 1510 1510 1510 1510 1510 1510 1510 1510 1510 1510 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:8] "NQU9.Open" "NQU9.High" "NQU9.Low" "NQU9.Close" ...
  Indexed by objects of class: [POSIXt,POSIXct] TZ: America/Chicago
  xts Attributes:
List of 4
 $ from   : chr "20090611  04:33:46"
 $ to     : chr "20090616  04:33:46"
 $ src    : chr "IB"
 $ updated: POSIXct[1:1], format: "2009-06-15 22:33:46.46141"

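library(quantmod)  ## for Vo(), which extracts the Volume column
nqi <- index(NQ)
## encode the time of day of each bar as an HHMM number (e.g. 15:30 -> 1530)
hm <- as.POSIXlt(nqi)$min + as.POSIXlt(nqi)$hour*100
## total volume for each minute of the day, summed across all days
NQV <- aggregate(Vo(NQ), as.factor(hm), sum)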

barplot(NQV)
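
The original question asked for the mean volume per minute rather than the sum. As a rough sketch of the same grouping idea in base R only, the snippet below compares a string-matching loop with a single grouped pass. The data here is made up (idx and vol are hypothetical stand-ins for index(hqm) and the volume column), and format() is hoisted out of the loop to keep the slow version tolerable, so this is an illustration rather than a benchmark of the original code.

```r
## Hypothetical stand-in data: 3 days of 1-minute bars with made-up volumes
set.seed(1)
idx <- seq(as.POSIXct("2009-06-09 00:00:00", tz = "UTC"),
           by = "1 min", length.out = 3 * 24 * 60)
vol <- sample(1:1000, length(idx), replace = TRUE)

## Slow: one full string comparison over the whole index per minute of day
idx_str <- format(idx, "%H:%M")
mins <- sprintf("%02d:%02d", rep(0:23, each = 60), rep(0:59, times = 24))
slow <- sapply(mins, function(m) mean(vol[idx_str == m]))

## Fast: build a numeric HHMM key once, then group in a single pass
lt <- as.POSIXlt(idx)
hm <- lt$hour * 100 + lt$min
fast <- tapply(vol, hm, mean)

## Compare the two results (same minute order in both)
all.equal(unname(slow), as.vector(fast))
```

With an xts object, swapping sum for mean in the aggregate() call above should give the same kind of result.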

The axis/chart leaves a lot to be desired, but once again that should
be enough to set you on the right path.
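
One possible way to get a more readable axis, sketched in base R under the assumption that the bars are keyed by the HHMM number used above: turn the keys back into "HH:MM" labels and only label the full hours. The data (idx, vol) is again a made-up stand-in, and the plotting choices are just one option.

```r
## Hypothetical stand-in data: 2 days of 1-minute bars, volume 1000 each
idx <- seq(as.POSIXct("2009-06-09 00:00:00", tz = "UTC"),
           by = "1 min", length.out = 2 * 24 * 60)
vol <- rep(1000, length(idx))

## Numeric HHMM key, then total volume per minute of day
lt <- as.POSIXlt(idx)
hm <- lt$hour * 100 + lt$min
vol_by_min <- tapply(vol, hm, sum)

## Rebuild "HH:MM" labels from the numeric keys
keys   <- as.integer(names(vol_by_min))
labels <- sprintf("%02d:%02d", keys %/% 100, keys %% 100)

## Draw the bars without the default names, then label only the full hours
bp <- barplot(vol_by_min, axisnames = FALSE, border = NA, space = 0,
              xlab = "Time of day", ylab = "Volume")
on_hour <- keys %% 100 == 0
axis(1, at = bp[on_hour], labels = labels[on_hour], las = 2, cex.axis = 0.7)
```

barplot() returns the bar midpoints, which is what lets axis() place the hour labels under the right bars.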

HTH
Jeff

On Mon, Jun 15, 2009 at 10:24 PM, Jeff Ryan<jeff.a.ryan at gmail.com> wrote:
> I think you want something like ?aggregate.zoo
>
> I didn't pull actual volume data, but here is an example that will
> show what you can do:
>
> library(xts)  ## only used for the sequence and to leverage
>               ## aggregate.zoo internally.
>
> ## generate a sequence of POSIXct 1 mo @ 1min
> x <- timeBasedSeq('20090515/20090615 12:00')
>
> ## convert to POSIXlt and turn into HHMM numeric format
> hm <- as.POSIXlt(x)$min + as.POSIXlt(x)$hour * 100
>
> ##  your original "Volume" column (here a simple xts object with each
> ##  min having Vol=1000)
> ##  There are 32 observations at each minute in 00:00--12:00 and 31
> ##  for 12:01--23:59
> xx <- xts(rep(1000,length(x)), x)
>
> ##  using 'aggregate' to apply sum to the matching times
> ax <- aggregate(xx, as.factor(hm), sum)
>
> head(ax)
>
> 0 32000
> 1 32000
> 2 32000
> 3 32000
> 4 32000
> 5 32000
>
> tail(ax)
>
> 2354 31000
> 2355 31000
> 2356 31000
> 2357 31000
> 2358 31000
> 2359 31000
>
> I haven't had a chance to actually test this, but at the very least it
> should provide a start for you.
>
> And the above is very fast:
>
>  system.time(ax <- aggregate(xx, as.factor(hm), sum))
>   user  system elapsed
>  0.058   0.015   0.073
>
> HTH
> Jeff
> On Mon, Jun 15, 2009 at 9:27 PM, Wind<windspeedo99 at gmail.com> wrote:
>> The periodicity() function in xts is a good tool for axis manipulation.
>>
>> Maybe I should not use character string methods to compile the
>> distribution of per-minute volume, as Brian suggested.  But what
>> function should be used for such a task in R?  I've tried it in kdb+; it
>> is fairly simple and quick enough with the select and xbar functions.
>> But I am not familiar with R.  Maybe there are some functions for this
>> specific task that I don't know.
>>
>> Thanks Brian.
>>
>>
>> On Tue, Jun 16, 2009 at 8:00 AM, Brian G. Peterson<brian at braverock.com> wrote:
>>> It seems that the slow part is all the character string manipulation.  This
>>> would be slow in almost any programming language.   Honestly, I am always
>>> annoyed by useless axes in charts that simply count from 1 to n.  A time
>>> axis at least has some real meaning, and avoids the useless rewriting of
>>> character strings.
>>>
>>> You should be able to get a meaningful, readable axis using the
>>> periodicity() function in xts without the string manipulation.
>>>
>>> Regards,
>>>
>>>   - Brian
>>>
>>> Wind wrote:
>>>>
>>>> I want to plot the distribution of volume of the future CLN9 along
>>>> a 24-hour axis.  The following code completes the task, but
>>>> the step sapply(mins,function(x)
>>>> {mean(hqm[which(format(index(hqm),"%H:%M")==x),5])}) is very time consuming.
>>>> Any suggestions for code with better performance would be highly
>>>> appreciated.
>>>>
>>>>
>>>> The data hqm has been retrieved from IB via IBrokers.
>>>>
>>>>
>>>>>
>>>>> head(hqm[,5])
>>>>>
>>>>
>>>>                    CLN9.Volume
>>>> 2009-05-25 06:00:00          17
>>>> 2009-05-25 06:01:00           2
>>>> 2009-05-25 06:02:00          11
>>>> 2009-05-25 06:03:00          26
>>>> 2009-05-25 06:04:00          20
>>>> 2009-05-25 06:05:00           5
>>>>
>>>>>
>>>>> tail(hqm[,5])
>>>>>
>>>>
>>>>                    CLN9.Volume
>>>> 2009-06-15 21:51:00        1050
>>>> 2009-06-15 21:52:00         807
>>>> 2009-06-15 21:53:00         782
>>>> 2009-06-15 21:54:00         385
>>>> 2009-06-15 21:55:00         562
>>>> 2009-06-15 21:56:00         423
>>>>
>>>>>
>>>>>
>>>>> mins<-unlist(lapply(0:23,function(h){sapply(0:59,function(m){paste(sprintf("%02d",h),sprintf("%02d",m),sep=":")})}))
>>>>> head(mins)
>>>>>
>>>>
>>>> [1] "00:00" "00:01" "00:02" "00:03" "00:04" "00:05"
>>>>
>>>>>
>>>>> tail(mins)
>>>>>
>>>>
>>>> [1] "23:54" "23:55" "23:56" "23:57" "23:58" "23:59"
>>>>
>>>>
>>>>>
>>>>> temp<-sapply(mins,function(x)
>>>>> {mean(hqm[which(format(index(hqm),"%H:%M")==x),5])})
>>>>> head(temp)
>>>>>
>>>>
>>>>   00:00    00:01    00:02    00:03    00:04    00:05
>>>> 279.1333 284.9333 247.8667 176.3333 278.8667 179.0667
>>>>
>>>>>
>>>>> tail(temp)
>>>>>
>>>>
>>>>   23:54    23:55    23:56    23:57    23:58    23:59
>>>> 250.2667 312.7333 318.9333 210.8000 258.2000 232.8667
>>>>
>>>>>
>>>>> plot(temp)
>>>>>
>>>>
>>>> _______________________________________________
>>>> R-SIG-Finance at stat.math.ethz.ch mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-finance
>>>> -- Subscriber-posting only.
>>>> -- If you want to post, subscribe first.
>>>>
>>>
>>>
>>> --
>>> Brian G. Peterson
>>> http://braverock.com/brian/
>>> Ph: 773-459-4973
>>> IM: bgpbraverock
>>>
>>>
>>>
>>
>>
>
>
>
> --
> Jeffrey Ryan
> jeffrey.ryan at insightalgo.com
>
> ia: insight algorithmics
> www.insightalgo.com
>



-- 
Jeffrey Ryan
jeffrey.ryan at insightalgo.com

ia: insight algorithmics
www.insightalgo.com


