[R-SIG-Finance] R-finance frameworks (was "Backtesting trade systems")

Mon Jul 20 10:40:44 CEST 2009

Hi all,

Jeff's points are really important. I've also spent many hours with
him and others debating ways to better coordinate our efforts.
Here are some thoughts and a proposal:

	THOUGHTS:

1) Most work in finance is done closed-source. Finance companies
are reluctant to let their employees open-source their developments,
or share anything, even when they use open source tools such as R.
I think this is slowly changing, but it is still an issue that comes
up all the time.

2) Even for those of us who have started writing public R packages,
the effort has been relatively isolated. There's no real community
around R and finance yet, as far as I can see. Two groups are
forming around the two annual conferences: one in Switzerland
(Rmetrics) and another in Chicago (Jeff, Josh, Brian, Peter...,
those behind xts, TTR, blotter, PerformanceAnalytics, etc.).

But there are still few broader conversations about coordinated
ways forward (common needs, frameworks, community).

Take, for example, the biostatistics community around the
Bioconductor project. They have long been working with R. They have
a dedicated package repository, several lists, packages providing
general infrastructure, and lots of specialized packages. I long
for something like that for R-finance...

3) I do think that R itself, while a wonderful platform that is
very flexible and promotes rapid prototyping, in a sense
doesn't exactly foster collaboration. Why do I say that?

Because there's an obvious lack of conventions, standards
and guidelines for programming in R. Two "official" object systems
(S3 and S4), plus several others created by package writers. No clear
coding conventions. Lack of modularity in the base code, which does
not lend itself to reuse...

And standards, like them or not, are at the heart of team work 
and collaboration.

	A PROPOSAL:

In my opinion, there are two things we can do to improve
this situation well before talking about complex things like
"general frameworks":

1) Establish common ground via a base set of programming 
guidelines and standards. Just a required base for team work.

2) Establish a "common language" of finance within R. Following
current programming trends, we could call it an R-finance DSL 
(Domain Specific Language).

Within R, a "common language" is defined through generic
functions. Generics are at the heart of the R language,
and what R-base does is define this common language for
manipulating general data via generics ('c', 'rbind', 'cbind',
'summary', etc.)

What good R packages do is extend R's behaviour by applying these
generics to new data structures. That's why, in my opinion, 'zoo',
and then 'xts', stood out over the initial implementations
of other time series packages: they followed the standard R
generics much more closely, reducing the cognitive load required
to start using the new packages.
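
To illustrate the idea of extending existing generics to a new data
structure, here is a minimal base-R sketch. The class name
'price_series' and its 'currency' attribute are hypothetical, invented
only for this example; they do not come from zoo, xts or any other
package.

```r
# Hypothetical toy class: a numeric vector of prices tagged with a currency.
price_series <- function(x, currency = "USD") {
  structure(as.numeric(x), currency = currency, class = "price_series")
}

# Method for the existing generic 'c': combine series, keep the class.
c.price_series <- function(...) {
  parts <- list(...)
  price_series(unlist(lapply(parts, as.numeric)),
               currency = attr(parts[[1]], "currency"))
}

# Method for the existing generic '[': subset, keep class and currency.
`[.price_series` <- function(x, i) {
  price_series(unclass(x)[i], currency = attr(x, "currency"))
}

# Method for the existing generic 'print'.
print.price_series <- function(x, ...) {
  cat("<price_series>", length(x), "prices in", attr(x, "currency"), "\n")
  print(as.numeric(x), ...)
  invisible(x)
}

a  <- price_series(c(100, 101.5, 99.8))
b  <- price_series(c(102.2, 103))
ab <- c(a, b)      # dispatches to c.price_series
print(ab[2:3])     # dispatches to `[.price_series`, then print.price_series
```

A user who already knows 'c', '[' and 'print' has nothing new to learn
here, which is the point being made about zoo and xts.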

The problem with R in finance is that, lacking guidelines and a
common language, each package author has come up with completely
different conventions and names for the same things.

As a simple example, look at one of the most basic calculations
in finance: computing instrument returns. We have more than
5 different names for the same concept in different packages:

* quantmod:
periodReturn(x, period = "monthly", subset = NULL, 
	type = "arithmetic", leading = TRUE, ...)

* timeSeries:
returns(x, 
	method = c("continuous", "discrete", "compound", "simple"), 
	percentage = FALSE, ...)
getReturns(...)
returnSeries(...)		

* PerformanceAnalytics:
Return.calculate(prices, method = c("compound", "simple"))
Return.cumulative(R, geometric = TRUE)	

* TTR:
ROC(x, n = 1, type = c("continuous", "discrete"), na = NA)

and I think there are several others I'm missing right now. Similar
examples could be given for all types of functions related to
portfolios, simulation, backtesting, risk modelling, etc.

What's the point? These differences already hinder the
interoperability of code, and we have not even started to talk
about modelling instruments or creating frameworks.
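
Underneath the different names and labels, the calculation itself is
small. The sketch below uses base R only and assumes, as I read the
signatures quoted above, that labels such as "arithmetic", "discrete"
and "simple" refer to one convention and "continuous" (log) to the
other; each package's documentation should be checked for the exact
meaning of its own labels. The price vector is made up.

```r
# Made-up price series for illustration.
prices <- c(100, 105, 102.9, 108.045)

# Simple (discrete) returns: P[t] / P[t-1] - 1
simple_ret <- prices[-1] / prices[-length(prices)] - 1

# Log (continuously compounded) returns: log(P[t] / P[t-1])
log_ret <- diff(log(prices))

# The two conventions are linked by log_ret = log(1 + simple_ret).
stopifnot(isTRUE(all.equal(log_ret, log1p(simple_ret))))

# The cumulative return over the whole period agrees either way.
cum_from_simple <- prod(1 + simple_ret) - 1
cum_from_log    <- exp(sum(log_ret)) - 1
stopifnot(isTRUE(all.equal(cum_from_simple, cum_from_log)))
```

Two formulas, then, hiding behind at least seven function names and
five or six argument labels.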

Participation of package authors and detailed documentation
of the design choices taken would be key for a package
providing these common generics. Default methods for
the generics could be included, but they would not be so
important as a first step.

With the common generics in place, package authors could start to
adopt them and add methods for them. No need to rewrite their
older APIs: just add the new generics as wrappers around
existing code and migrate slowly.
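
A rough sketch of what such a wrapper could look like, using S3. Every
name here ('calcReturns', 'legacyROC', 'my_prices') is hypothetical and
chosen only for illustration; none of it is taken from finbase or from
any existing package.

```r
# Hypothetical common generic: one name, one vocabulary for 'type'.
calcReturns <- function(x, type = c("simple", "log"), ...) {
  UseMethod("calcReturns")
}

# Optional default method for plain numeric price vectors.
calcReturns.default <- function(x, type = c("simple", "log"), ...) {
  type <- match.arg(type)
  x <- as.numeric(x)
  switch(type,
         simple = x[-1] / x[-length(x)] - 1,
         log    = diff(log(x)))
}

# Stand-in for a package's existing API, with its own name and labels.
legacyROC <- function(x, type = c("continuous", "discrete")) {
  type <- match.arg(type)
  if (type == "continuous") diff(log(x)) else x[-1] / x[-length(x)] - 1
}

# Hypothetical class owned by that package.
my_prices <- function(x) structure(as.numeric(x), class = "my_prices")

# The package author adds a thin method that translates the common
# vocabulary into the legacy call; the old API stays untouched.
calcReturns.my_prices <- function(x, type = c("simple", "log"), ...) {
  type <- match.arg(type)
  legacyROC(unclass(x),
            type = if (type == "log") "continuous" else "discrete")
}

p <- c(100, 105, 102.9)
calcReturns(p)                            # default method
calcReturns(my_prices(p), type = "log")   # wrapper around legacyROC
```

Users call one function with one set of labels, while each package
keeps its own implementation behind the method.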

NOTE: Jeff and I already started to think about this at the last
Chicago conference, and produced a very simple prototype, called
"finbase". All R-finance developers are invited to contribute to
this effort.

Only after that could we really start to think further
(frameworks?)

I look forward to your opinions.

Best,

Enrique

------------------------------
Date: Thu, 16 Jul 2009 11:32:31 -0500
From: Jeff Ryan <jeff.a.ryan at gmail.com>
Subject: Re: [R-SIG-Finance] Backtesting trade systems
To: Robert Sams <robert at sanctumfi.com>
Cc: r-sig-finance at stat.math.ethz.ch
Message-ID:
	<e8e755250907160932t41eb4734uef833bd3c371154 at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

All,

I suspect this topic/conversation is central to most of us -- and
quite important to R and R-Finance in general.

There are two primary issues that we seem to be encountering at this
point in the R-Finance community:

1) consistency of design: so all of our work becomes additive.
2) generality: writing code that works in the most general case possible

The first is quite difficult.  I have personally been involved in
*hundreds* of hours of productive (?) conversation with dozens of
people on this topic, be it on the list, useR, NY, Meielisalp,
R/Finance Chicago and back again at Meielisalp (and many sessions
involving Jak's tap in Chicago).  All of the conversations have been
certainly valuable, though even after all of this I still can't figure
out the best path...

One thing that I have worked on personally is making consistency
possible.  'xts' was born of this issue.  We didn't need another
time-series class.  We had 9 or 10.  We needed (or I needed) a way to
abstract the class and just get the job done.  'xts' does that
reasonably well -- though still far from complete.  The same issue
exists for portfolio level data, as well as numerous other 'classes'.

Everyone should be free to use their own data object.  What would be
nice (and required for maximum usefulness in my opinion) is if that
choice wasn't imposed on the end-user in order to use your software.
This leads to more users, more feedback, better code, and a more
'complete' landscape of tools that can be used.

As a time series example -- xts lets the developer use one type of
object (simplifying code), and accept *ALL* objects (simplifying the
user experience).
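
The pattern described here (convert at the boundary, compute on one
representation, hand back what the user supplied) can be sketched in
base R as follows. The helper names 'to_internal' and 'lagged_diff' are
hypothetical and are not xts functions; as far as I know, xts exposes
this idea through 'try.xts' and 'reclass', and its documentation is the
place to check the details.

```r
# Hypothetical helper: strip the input down to plain numbers and
# remember how to rebuild the caller's class afterwards.
to_internal <- function(x) {
  restore <- if (is.ts(x)) {
    tsp_x <- tsp(x)   # c(start, end, frequency)
    function(v) ts(v, end = tsp_x[2], frequency = tsp_x[3])
  } else {
    function(v) v
  }
  list(values = as.numeric(x), restore = restore)
}

# The developer writes the computation once, against one representation.
lagged_diff <- function(x) {
  obj <- to_internal(x)
  out <- diff(obj$values)
  obj$restore(out)    # give back the kind of object the user passed in
}

r_vec <- lagged_diff(c(1, 4, 9, 16))                        # plain numeric
r_ts  <- lagged_diff(ts(c(1, 4, 9, 16), start = 2005))      # 'ts' object
```

Supporting a further class would only mean adding a branch to the
conversion helper, without touching 'lagged_diff'.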

The point of the above is that interface counts, and it counts for a
lot.  If a framework is to work, it needs to be accessible to all
users.

The second point I'll toss out there is one of generality.

This is just as difficult as the first conceptually, though I would
argue possibly even more intractable.  We just can't individually
understand what we collectively require.

quantmod set about in 2007 trying to create a 'framework' and has
failed miserably.  It does some things like data management and
charting quite well, but the often-asked question is how
specifyModel/buildModel/tradeModel works.  The short answer is that it
does, and it doesn't.  It works on one type of strategy (EOD), and
doesn't on others:

(...)

Looks great!  But it is useless.  It is a casualty of specificity.  A
generalized framework would be awesome, but I think it is a much
larger task than I can handle.  I suspect that general to one person
is fantastically restrictive to another.

R is the 'general' framework.

What we really need is consistency of the pieces that make
building/testing models in *R* easier.

This comes from projects like blotter, I think, where we can use the
parts that we want, and only those that we want.

As I said at the start, this is a lot more involved than it
looks.  Obviously best of luck to all who take up the challenge.  This
thread is an awesome start to the (larger) public conversation that
has been taking place over beers for quite some time now.

Best,
Jeff

-- 
Jeffrey Ryan
jeffrey.ryan at insightalgo.com

ia: insight algorithmics
www.insightalgo.com