[R-SIG-Finance] Extracting OHLC from trade price series

Jeff Ryan jeff.a.ryan at gmail.com
Thu Feb 28 17:26:14 CET 2008


Hi Yuri:

It is a bug. Sorry and thanks for pointing it out.  I think somewhere
in the process of moving the code to 'xts' the univariate case was
broken. Looking at the code I see that I dropped the 'k' argument for
the periods in the internal calculation of the endpoints - but only
for the univariate case.
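
For illustration only (this is not the actual patch), the role of 'k' in the endpoint calculation can be seen by calling endpoints() directly. The sketch below assumes a current 'xts'; the names x, ep1, ep5 and bars are made up for the example.

```r
library(xts)

# Ten simulated trades at irregular times (hypothetical data)
set.seed(100)
idx <- as.POSIXct("2007-01-02 08:30:00", tz = "UTC") +
  sort(ceiling(runif(10) * 1000))
x <- xts(rnorm(10, 50), idx)

# endpoints() returns the row numbers that close each period,
# starting with 0 and ending with nrow(x)
ep1 <- endpoints(x, on = "minutes")         # 1-minute periods (k defaults to 1)
ep5 <- endpoints(x, on = "minutes", k = 5)  # 5-minute periods

# With k passed through, a univariate series aggregates to 5-minute OHLC bars
bars <- to.period(x, period = "minutes", k = 5)
```

If 'k' is dropped, the calculation falls back to ep1, so every minute that has a trade gets its own bar no matter which 'k' was requested.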

I have updated the source on R-forge if you are interested.  Binaries
should follow soon.

With simulated data:

library(xts)
set.seed(100)
mm <- xts(matrix(rnorm(10,50),10,1),
          as.POSIXct('2007-01-02 08:30:00') + sort(ceiling(runif(10)*1000)))

to.minutes(to.minutes(mm),k=5)

-or-

to.minutes5(to.minutes(mm))

# The inner to.minutes() call makes the data into 1-minute OHLC data (all
# four columns the same, which is what you were getting).
# The outer to.minutes() call will then work correctly.

As far as padding with NAs goes, I agree that a prettier output would be
nice. That is, 5 minute bars would look better with time stamps at 5 minute
intervals, even if there was no data for that interval.  For the
moment I am not too sure how to implement that well.  I will give it
some thought though. To that end, what do you think it should look
like?
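
One possible shape for such padded output, as a rough sketch rather than a settled design: align the bar time stamps to 5 minute marks and merge with an empty series on a regular grid. This assumes a current 'xts' in which align.time() is available; ticks, bars, grid and padded are hypothetical names, and the tick times are made up.

```r
library(xts)

# Six hypothetical ticks with long gaps between them
set.seed(100)
idx <- as.POSIXct("2007-01-02 08:30:00", tz = "UTC") +
  c(10, 70, 130, 1500, 1510, 2900)
ticks <- xts(rnorm(6, 50), idx)

# 5-minute OHLC bars, each stamped at the last tick inside the bar
bars <- to.minutes5(ticks)

# Move each stamp up to the next 5-minute boundary (n is in seconds)
bars <- align.time(bars, n = 5 * 60)

# Regular 5-minute grid spanning the bars; merging with an empty
# xts object on that grid adds NA rows for intervals without trades
grid <- seq(start(bars), end(bars), by = "5 min")
padded <- merge(bars, xts(, grid))
```

Intervals with no trades show up as rows of NA. If a filled series is preferred, something like na.locf() from 'zoo' could be applied afterwards, though whether carrying the last close forward is appropriate depends on the use.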

Jeff



On Thu, Feb 28, 2008 at 7:14 AM, Yuri Volchik <yuri.volchik at gmail.com> wrote:
>
>  Hi to all members,
>
>  I presume my question is mostly for Jeff, as it concerns to.minutes.
>  I have an xts object with tick data:
>
>  mm<-xts(dd[,2:4],new.timestamps)
>
>                       BO  Price        Quantity
>  2007-01-02 08:13:52  O   18.50000000         5
>  2007-01-02 08:15:02  B   18.29999924         3
>  2007-01-02 08:15:02  B   18.29999924        10
>  2007-01-02 08:46:31  B   18.00000000        10
>  2007-01-02 09:01:43  B   17.85000038         1
>  2007-01-02 09:04:48  B   17.85000038         9
>  2007-01-02 09:19:58  B   17.85000038         1
>  2007-01-02 09:38:19  B   17.85000038         1
>  2007-01-02 09:54:08  B   17.70000076         5
>  2007-01-02 10:07:25  O   17.79999924         5
>  ...
>
>
>  and trying to convert to 5 min data using
>  xx<-to.minutes(mm[,2],5,'minutes')
>
>                     minutes.Open minutes.High minutes.Low minutes.Close
>  2007-01-02 08:13:52  18.50000000  18.50000000 18.50000000   18.50000000
>  2007-01-02 08:15:02  18.29999924  18.29999924 18.29999924   18.29999924
>  2007-01-02 08:46:31  18.00000000  18.00000000 18.00000000   18.00000000
>  2007-01-02 09:01:43  17.85000038  17.85000038 17.85000038   17.85000038
>  2007-01-02 09:04:48  17.85000038  17.85000038 17.85000038   17.85000038
>  2007-01-02 09:19:58  17.85000038  17.85000038 17.85000038   17.85000038
>  2007-01-02 09:38:19  17.85000038  17.85000038 17.85000038   17.85000038
>  2007-01-02 09:54:08  17.70000076  17.70000076 17.70000076   17.70000076
>  2007-01-02 10:07:28  17.79999924  17.79999924 17.79999924   17.79999924
>  2007-01-02 10:17:22  17.79999924  17.79999924 17.79999924   17.79999924
>  ....
>
>  The question is: is this output correct, and is there a way to convert tick
>  data to a somewhat 'nice' representation with equally spaced time intervals,
>  using a specific method of interpolation for missing data (or just leaving
>  them as NA)? I created such code in R, but I think it is quite slow.
>
>  Thanks
>
>
>
>
>  Jeff Ryan wrote:
>  >
>  > Hi,
>  >
>  > The package 'xts' (the function in question was previously part of
>  > 'quantmod') has a nice and fast aggregation function, called
>  > 'to.period', that allows you to create OHLC from any univariate
>  > series, or from an existing OHLC series.
>  >
>  > library(quantmod)
>  > getSymbols("QQQQ")
>  >
>  > to.monthly(QQQQ)  # yields a monthly series from daily data
>  >
>  > The code works equally well for anything from minute bars on up.  It
>  > should work below that, though I can't promise anything as I haven't
>  > really tested that recently.  Other functions in the group are
>  > to.minutes, to.hourly, to.daily... you get the idea.
>  >
>  > to.period is the function you want to look at.  It calls Fortran - so
>  > it is very fast on all but gigantic data sets - and then nothing is :)
>  >
>  >
>  >
>



-- 
There's a way to do it better - find it.
Thomas A. Edison


