[R-sig-finance] bayesian signal classifier
Krishna Kumar
kriskumar at earthlink.net
Tue Nov 29 06:07:52 CET 2005
Here is a simple example of how this could be done using the mclust package
(for more on mclust, see
http://staff.washington.edu/fraley/mclust/tr415R.pdf).
I have attached a test file with 11 columns. The first 10 columns are
various features (signals), and the last column (column 11) is the best
thing to do based on 20/20 hindsight, i.e. a buy or sell indicator.
> require("mclust")
> findata <- read.table("findata.txt", header=FALSE)
# Create a matrix out of the 10 feature columns
> finMatrix <- as.matrix(findata[,1:10])
# Identify column 11 as the classes and do the classification with default values.
> finClass <- findata[,11]
> finMclust <- Mclust(finMatrix, maxG=2)
> plot(finMclust, finMatrix)
Here we trained on the entire data set, and finMclust$classification gives
the decision made by the classifier for each row.
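
One way to check those decisions against the hindsight labels is a simple cross-tabulation. The sketch below uses base R only; the two short vectors are made-up stand-ins (labels for finClass, clusters for finMclust$classification), not values from findata.txt. Mclust fits the mixture without looking at finClass, so the cluster numbers need not line up with the label coding, and the sketch scores both possible matchings.

```r
# Hypothetical stand-ins: 'labels' plays the role of finClass,
# 'clusters' the role of finMclust$classification.
labels   <- c(1, 1, 2, 2, 1, 2, 2, 1)
clusters <- c(2, 2, 1, 1, 2, 1, 2, 2)

# Cross-tabulate the known labels against the cluster assignments.
tab <- table(label = labels, cluster = clusters)
print(tab)

# Cluster numbers are arbitrary, so with two clusters and two labels
# there are two possible matchings; keep whichever agrees more often.
agree_same    <- sum(diag(tab))
agree_swapped <- sum(tab) - agree_same
error_rate    <- 1 - max(agree_same, agree_swapped) / sum(tab)
print(error_rate)
```

With the real objects this would be table(finClass, finMclust$classification), assuming both labels and both clusters actually occur in the data.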
Now suppose you want to train on a subset, say all the odd rows
(alternatively, one can cross-validate with ?sample or even bootstrap a training data set):
> odd <- seq(from=1, to=nrow(findata), by=2)
> even <- seq(from=2, to=nrow(findata), by=2)
> round(cv1EMtrain(data = findata[odd,-11], labels = findata[odd,11]),3)
This shows that the VVI model would be selected based on the training data (all the odd rows).
> vviModd <- mstepVVI(data=findata[odd,-11], z=unmap(findata[odd,11]))
> vviZ <- do.call("estepVVI", c(vviModd, list(data=findata[,-11])))$z
> classError(map(vviZ[odd,]), findata[odd,11])
How do we do on the test data?
> classError(map(vviZ[even,]), findata[even,11])
=== 0.04081633
Hmm, so a classification error of about 4% on the test data... whew! (Maybe
I am missing something.)
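
The ?sample alternative to the odd/even split could look roughly like the sketch below. It is base R only; the row count n is a made-up placeholder (with the real data it would be nrow(findata)), and the seed is arbitrary.

```r
set.seed(1)   # arbitrary seed, only so the split is reproducible
n <- 100      # hypothetical row count; use nrow(findata) with the real data

# Random half of the rows for training, the remaining rows for testing.
train <- sort(sample(n, size = n %/% 2))
test  <- setdiff(seq_len(n), train)

# Bootstrap training set: n row indices drawn with replacement.
# Rows never drawn ("out-of-bag") can serve as the test set.
boot <- sample(n, replace = TRUE)
oob  <- setdiff(seq_len(n), boot)
```

train and test would then take the place of odd and even in the calls above, and repeating the split with different seeds gives a sense of how stable the 4% figure is.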
Later,
Krishna
paul sorenson wrote:
>Guy,
>
>I would be interested in the paper thanks. Unfortunately my level of
>expertise is not high in these matters.
>
>I may have just misunderstood your and Krishna's responses; the kind of
>paradigm I am thinking of is:
>
> - User selects signals he/she wants to monitor.
>
> - When the user makes a buy/sell decision, the classifier then looks at
>the parameters of those signals and classifies the conditions for that
>decision.
>
> - The user continues to train the classifier in this way, analogously
>to training a spam filter.
>
> - The classifier can then start emitting buy/sell signals based on the
>training, i.e. it is personalized to that user's previous choices.
>
>I only mentioned Bayesian methods because the most effective spam
>filtering I have used is apparently based on that method
>(http://spambayes.org).
>
>cheers
>
>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: findata.txt
Url: https://stat.ethz.ch/pipermail/r-sig-finance/attachments/20051129/7d5944ba/findata.txt