Mark Knecht markknecht at gmail.com
Wed Jan 5 18:57:59 CET 2011

```
On Wed, Jan 5, 2011 at 9:21 AM, David St John <dstjohn at math.uic.edu> wrote:
> Mark,
>
> I would suggest thinking about correlation between the returns.  For
> example, using daily data for SPY, DJI, QQQQ, I see the correlation in the
> returns using the following calls:
>
>> library(quantmod)
>
>> getSymbols(c('SPY','^DJI','QQQQ'))
>
> [1] "SPY" "DJI" "QQQQ"
>
>> x1 <- log(as.vector(as.matrix(SPY)[,4])/(as.vector(as.matrix(SPY)[,1])))
>
>> x2 <- log(as.vector(as.matrix(QQQQ)[,4])/(as.vector(as.matrix(QQQQ)[,1])))
>
>
>> x3 <- log(as.vector(as.matrix(DJI)[,4])/(as.vector(as.matrix(DJI)[,1])))
>
>> data <- cbind(x1,x2,x3)
>
>> cor(data)
>
> x1 x2 x3
>
> x1 1.0000000 0.9067068 0.8284568
>
> x2 0.9067068 1.0000000 0.7556838
>
> x3 0.8284568 0.7556838 1.0000000
>
> In this example I just used log(close/open) as the one-period return.
>
> So, if you want to see if your systems are correlated, I would suggest
> defining the systems' return series by multiplying the one period returns
> with the position indicated by the system.
>
> I generated a random 'system' as a sequence of -1, 0, and 1 values as an
> example:
>
>> z1 <- rnorm(length(x1))
>
>> z2 <- rnorm(length(x2))
>
>> z3 <- rnorm(length(x3))
>
>> s1 <- ifelse(z1 > 1, 1, 0) - ifelse(z1 < -1, 1, 0)
>
>> s2 <- ifelse(z2 > 1, 1, 0) - ifelse(z2 < -1, 1, 0)
>
>> s3 <- ifelse(z3 > 1, 1, 0) - ifelse(z3 < -1, 1, 0)
>
>> signal <- cbind(s1,s2,s3)
>
>> cor(data*signal)
>
> x1 x2 x3
>
> x1 1.0000000 0.7682398 0.6788077
>
> x2 0.7682398 1.0000000 0.6088950
>
> x3 0.6788077 0.6088950 1.0000000
>
> As you can see, even if you were just randomly buying and selling these
> contracts at random times, and had no position most of the time (68% of
> the time in my example), the systems would still be strongly
> correlated. Since you are trying to find comparatively lower correlation
> between market/system pairs, I think this would be as fine a measure as
> any.  Just look for the smallest entries in your correlation matrix.
>
>
> Hope this helps,
> -David

David,
Thanks, that helps a lot. I like it because it's simple, the amount
of data being correlated isn't huge (daily returns for a few years, in
sample, out of sample, etc.), and it could easily scale to hourly or
some other time scale. I'll likely set up my data to take a cut at
this method in the next day or two and post back some results.

The one thing I don't have yet is how to pick the 'best' 5 out of
50, where 'best' implies something I'm interested in (profit, low
drawdowns, etc.). I can write a fitness function for what interests me.
The next step is finding a good solution for maximizing what I
consider 'best'. I currently have maybe a hundred trading systems to
poke around with in this manner. I'd like to find 5 that work well
together, doing different things to earn their keep, one making money
when another isn't.

There is a thread on the main R-users list right now called
"Cost-benefit/value for money analysis" that might be appropriate if I
let the value be how these respond to a custom fitness function and
limit the number chosen to 5.

Cheers,
Mark

```
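
The two ideas in this exchange (system returns as one-period returns times position, and picking a small subset with a custom fitness function) can be sketched together in R. This is only an illustration under stated assumptions: the returns are simulated rather than downloaded, the sizes `n_obs`, `n_sys`, and `k` are made up, and the `fitness` function with its `penalty` weight is a hypothetical stand-in for whatever 'best' means to you. It is not a method proposed by either poster.

```r
# Sketch only: simulated returns stand in for real market data.
set.seed(42)
n_obs <- 500   # number of daily observations (hypothetical)
n_sys <- 8     # number of candidate systems (hypothetical)
k     <- 3     # how many systems to pick (hypothetical)

# A shared market factor makes the simulated one-period returns correlated.
market  <- rnorm(n_obs, sd = 0.01)
returns <- sapply(seq_len(n_sys),
                  function(i) market + rnorm(n_obs, sd = 0.005 * i))
colnames(returns) <- paste0("sys", seq_len(n_sys))

# Random positions in {-1, 0, 1}. Note the space in `z < -1`:
# written as `z<-1`, R would parse it as an assignment.
z <- matrix(rnorm(n_obs * n_sys), n_obs, n_sys)
position <- ifelse(z > 1, 1, 0) - ifelse(z < -1, 1, 0)

# System return = one-period return times the position held.
sys_returns <- returns * position
cor_mat <- cor(sys_returns)

# Hypothetical fitness: mean return of the equally weighted subset,
# minus a penalty on the average pairwise correlation within the subset.
fitness <- function(idx, penalty = 0.001) {
  sub_cor <- cor_mat[idx, idx]
  avg_cor <- mean(sub_cor[upper.tri(sub_cor)])
  mean(rowMeans(sys_returns[, idx])) - penalty * avg_cor
}

# Exhaustive search: score every subset of size k and keep the best one.
candidates <- combn(n_sys, k)
scores     <- apply(candidates, 2, fitness)
best       <- candidates[, which.max(scores)]
best_names <- colnames(sys_returns)[best]

print(best_names)
print(round(cor_mat[best, best], 3))
```

Exhaustive search is only practical for small problems. Choosing 5 of 50 already means 2,118,760 subsets, and 5 of 100 means 75,287,520, so a greedy or other heuristic search would probably be needed at that scale.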