[R-SIG-Finance] kdb and q?

Andrew Piskorski atp at piskorski.com
Sun Jun 20 18:11:39 CEST 2010


On Sat, Jun 19, 2010 at 10:38:19PM +0530, Ajay Shah wrote:
> I wonder what the R community thinks about kdb and the q language.

K and Q always sounded very interesting and worth learning from, but
their closed nature and very high license fees present a rather large
disincentive.  (When learning a new programming language, I'd much
rather choose one that I have the freedom to use however I like down
the road.)

There is a plethora of interesting new database products and startups
around now: Vertica, VectorWise (MonetDB/x100), VoltDB, SciDB (if it
ever gets anywhere), etc.  Unfortunately, I've never seen a good
explanation of how Kdb is similar to or different from any of those.

Also, in my own experience the SQL:2003 "OLAP" order-aware operators
(lead, lag, dense_rank over partition by, etc.) are, despite their
clunky syntax, absolutely critical for serious use of market data
(e.g., daily prices or company fundamentals) in a general purpose
RDBMS like Oracle.
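
As a rough illustration of what those operators compute, here is a
minimal sketch using Python's built-in sqlite3 module.  It assumes the
bundled SQLite is version 3.25 or newer (the release that added window
functions); SQLite merely stands in for Oracle here, and the table
name, column names, and prices are made up for the example.

```python
import sqlite3

# Hypothetical daily closing prices; names and values are illustrative only.
rows = [
    ("AAA", "2010-06-14", 10.0),
    ("AAA", "2010-06-15", 10.5),
    ("AAA", "2010-06-16", 10.5),
    ("BBB", "2010-06-14", 20.0),
    ("BBB", "2010-06-15", 19.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_price (ticker TEXT, trade_date TEXT, close REAL)")
conn.executemany("INSERT INTO daily_price VALUES (?, ?, ?)", rows)

# LAG looks back one row in date order within each ticker;
# DENSE_RANK ranks closes within each ticker, with ties sharing a rank.
query = """
SELECT ticker,
       trade_date,
       close,
       LAG(close)   OVER (PARTITION BY ticker ORDER BY trade_date) AS prev_close,
       DENSE_RANK() OVER (PARTITION BY ticker ORDER BY close DESC) AS close_rank
FROM daily_price
ORDER BY ticker, trade_date
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The first row of each ticker has no previous close, so prev_close is
NULL there; that per-partition ordering is exactly what plain
(unordered) SQL cannot express without a self-join.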

Perhaps Q provides similar or better power in a cleaner or faster way,
I don't know.  It would certainly be nice to have it available
directly in your programming language rather than solely in an RDBMS
at the far end of a client-server pipe.  (The same is true for many
other features of the RDBMS; it's shocking how little programming
language environments and libraries have learned from the RDBMS over
the years.)
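
To make "directly in your programming language" concrete, here is a
small sketch of a lag-within-group operation written in plain Python,
with no database involved.  The function name lag_by_group, the field
names, and the sample prices are all hypothetical, and this says
nothing about how Q itself does it.

```python
from itertools import groupby
from operator import itemgetter

def lag_by_group(rows, key, order, value):
    """Sort rows (dicts) by (key, order) and add a 'prev' field holding the
    previous value within the same group, or None for the group's first row."""
    out = []
    ordered = sorted(rows, key=itemgetter(key, order))
    for _, group in groupby(ordered, key=itemgetter(key)):
        prev = None
        for row in group:
            out.append({**row, "prev": prev})
            prev = row[value]
    return out

# Hypothetical daily closes, deliberately out of order.
prices = [
    {"ticker": "BBB", "date": "2010-06-15", "close": 19.0},
    {"ticker": "AAA", "date": "2010-06-14", "close": 10.0},
    {"ticker": "AAA", "date": "2010-06-15", "close": 10.5},
    {"ticker": "BBB", "date": "2010-06-14", "close": 20.0},
]

lagged = lag_by_group(prices, "ticker", "date", "close")

# Daily simple return, defined only where a previous close exists.
returns = [
    (r["ticker"], r["date"], r["close"] / r["prev"] - 1.0)
    for r in lagged
    if r["prev"] is not None
]
print(returns)
```

This works, but it is hand-rolled; the point is that few general
purpose language environments ship such order-aware group operations
as a standard, well-optimized feature.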

I've long heard hints that vector languages like K and APL may tend to
map naturally to a more powerful order-aware superset of the
(unordered) relational model, but academic database researchers seem
to have written very little about this.  The only reference I've heard
of that might even be relevant (but which I've never read) is:

  http://www.amazon.com/gp/product/0201593793/102-7773671-0297757
  http://lambda-the-ultimate.org/node/3761#comment-55078

> A person from Morgan Stanley says that R is just great as a PL but for
> the performance issues faced in finance with intra-day data and
> real-time analysis, R is not able to keep up and they are doing a lot
> of things with kdb and q.

Well, Morgan Stanley is where Arthur Whitney wrote A+ before leaving
to do K, and as of 2004 Morgan (according to Whitney) was still using
A+ internally.  So, Morgan probably has the lowest bar of any big firm
to using K, Q, or other "weird" APL-derived languages.

I'd be curious to know who else is using K and its family of products,
and how many just use Kdb as a more-or-less black box vs. doing real
programming in K and/or Q.  (I suspect that most of Kx Systems'
revenue comes from Kdb; smart move for them.)

Btw, there were other APL shops building databases targeted at Wall
Street; Kdb just seems to have been (by far) the most successful of
them.  Soliton in Canada built TimeSquare (using Sharp APL), which I
evaluated back c. 2001 by implementing (part of) Shasha's FinTime
benchmarks on both TimeSquare and Oracle 8i.  The Soliton folks seemed
pretty sharp, but from what I recall, TimeSquare wasn't really any
faster than Oracle.  (Since we already needed Oracle for other tasks
and didn't really intend to become APL programmers, that made the
decision straightforward.)  Tick data is likely a different ballgame
from that, of course.

-- 
Andrew Piskorski <atp at piskorski.com>
http://www.piskorski.com/


