[R-SIG-Finance] Quantmod and Tick Data

Brian G. Peterson brian at braverock.com
Tue Sep 1 13:26:16 CEST 2009


jatin patni wrote:
> I am new to this field(Quantitative high frequency finance) and really would
> like some guidance from senior people. Please guide me to some good
> tutorials for handling dataframes and the xts package. I need to import data
> (time series, tick data) for my backtesting. The problem with tick data is:-
> 1)It has multiple symbols, so when I'm importing the data from a file into a
> dataframe, I have to again extract data into other dataframes with unique
> symbols(since my backtesting strategies will work on one symbol at a time,
> and for general charting purposes)
> 2)It does not have a date field, just the timestamp (hh:mm:ss), so I need to
> add the date field from the filename and convert it to a suitable format
> compatible with Quantmod/xts before starting to use the data for
> backtesting.
> 3)It is large, around 500MB per day(all symbols), so I need to split the
> file into 50000 rows per call into the dataframe. It may be a good idea to
> store it in a binary format of R.
>
> Since I'm a beginner with R, I'm having some stupid troubles handling
> dataframes, for eg. adding date into the columns of timestamps and
> converting it to a compatible format for quantmod/xts
>
> Please also guide me to some backtesting links(tutorials) or packages(like
> quantmod), preferably open source, for backtesting and even for implementing
> my own trading platform(for eg. Marketcetera)
>
>   
Jatin,

As you've probably guessed, most of your questions fall into the 
category of "FAQ", so I'm not going to spend much time giving detailed 
answers to those, as a search of the list archives will turn up plenty 
of detail on read.table and read.zoo.

I work with data very much like yours in that data may have multiple 
symbols in one extract from a database or similar.  I'll answer some of 
the particulars of your questions below.

1.
- use 'read.zoo' (or 'read.table' if 'read.zoo' doesn't work),
- use the 'format' argument in as.POSIXct to construct a POSIXct index 
for the zoo object
- simply hardcode the date from the filename next to the 'format' 
argument (for example, paste it onto each timestamp before parsing); 
this should work quite well
- use 'split' to construct a list of zoo objects by symbol
- convert to 'xts' (now that you have unique timestamps) using 'lapply'
- name your columns for each symbol (also using 'lapply', I suspect)
- cbind if that's useful to you to get things out of the list
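
The steps above can be sketched roughly as follows. This is only a minimal 
illustration, not a tested recipe for your files: the column names (symbol, 
time, price, size), the sample rows, the trade date and the UTC time zone 
are all assumptions made up for the example. It uses 'read.table' and builds 
the xts objects per symbol after the split, so that each index is unique 
within its symbol.

```r
library(xts)  # also attaches zoo

# Hypothetical sample: one day's ticks for two symbols, time of day only.
raw <- "symbol,time,price,size
AAA,09:30:01,10.00,100
BBB,09:30:01,20.50,200
AAA,09:30:02,10.01,300
BBB,09:30:03,20.49,100
AAA,09:30:05,10.02,500"

ticks <- read.table(text = raw, header = TRUE, sep = ",",
                    stringsAsFactors = FALSE)

# Assumed trade date; in practice this would come from the filename.
trade_date <- "2009-09-01"

# Build a full POSIXct timestamp from the date plus the hh:mm:ss field.
ticks$timestamp <- as.POSIXct(paste(trade_date, ticks$time),
                              format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# One data frame per symbol.
by_symbol <- split(ticks, ticks$symbol)

# Convert each piece to xts and prefix the column names with the symbol.
xts_list <- lapply(names(by_symbol), function(sym) {
  d <- by_symbol[[sym]]
  x <- xts(as.matrix(d[, c("price", "size")]), order.by = d$timestamp)
  colnames(x) <- paste(sym, colnames(x), sep = ".")
  x
})
names(xts_list) <- names(by_symbol)

# Optional: bind everything into one wide object (NA where a symbol
# has no tick at a given timestamp).
merged <- do.call(merge, unname(xts_list))
```

Keeping the per-symbol list is usually the more convenient form when a 
strategy works on one symbol at a time; the merged object is mainly useful 
for side-by-side inspection.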

2.
- see above, use format= with as.POSIXct
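
As a small sketch of that point, assuming a hypothetical filename of the 
form ticks_YYYYMMDD.csv (your naming scheme will differ, so adjust the 
pattern) and assuming UTC as the time zone:

```r
# Hypothetical filename; the date is the 8 digits before ".csv".
fname <- "ticks_20090901.csv"
date_str <- sub(".*_([0-9]{8})\\.csv$", "\\1", fname)

# Time-of-day strings as they might appear in the file.
times <- c("09:30:01", "15:59:59")

# Paste the date onto each time and parse both parts with format=.
stamps <- as.POSIXct(paste(date_str, times),
                     format = "%Y%m%d %H:%M:%S", tz = "UTC")
```

If the format string does not match the pasted text, as.POSIXct returns NA 
rather than raising an error, so it is worth checking any(is.na(stamps)) 
after parsing.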

3.
I regularly read files with 6-10 million rows into R, and I have read 
much larger files.  This should not be a problem for you.
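
For what it's worth, a rough sketch of one common way to make large reads 
cheaper and to keep a binary copy, as you suggested. The column layout is 
an assumption, and the temporary files only stand in for your real data; 
declaring 'colClasses' saves read.table from guessing the type of every 
column, and saveRDS/readRDS store a single R object in R's binary format.

```r
# Stand-in for a real tick file (hypothetical layout).
tmp_csv <- tempfile(fileext = ".csv")
writeLines(c("symbol,time,price,size",
             "AAA,09:30:01,10.00,100",
             "BBB,09:30:01,20.50,200"), tmp_csv)

# Declaring column types up front avoids type guessing on every column.
ticks <- read.table(tmp_csv, header = TRUE, sep = ",",
                    colClasses = c("character", "character",
                                   "numeric", "integer"),
                    comment.char = "")

# Store the parsed data in R's binary format and read it back.
tmp_rds <- tempfile(fileext = ".rds")
saveRDS(ticks, tmp_rds)
restored <- readRDS(tmp_rds)
```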

If these pointers and some searching don't solve your problem, please 
read the posting guide, and reply to the list with a small data sample 
(a few symbols and data points for each would be sufficient) and the 
code you have tried to make it work.  Someone should be able to help you 
from that point, as I think my pointers and some archive searching 
should be sufficient.

After you are able to successfully load and manipulate your data, you 
can ask additional questions regarding backtesting in R once you've had 
a chance to review the list archives and formulate more specific 
questions with the code examples of what you're trying to do.

Regards,

  - Brian

-- 
Brian G. Peterson
http://braverock.com/brian/
Ph: 773-459-4973
IM: bgpbraverock


