[R-sig-finance] bayesian signal classifier
kriskumar at earthlink.net
Tue Nov 29 06:07:52 CET 2005
Here is a simple example of how this could be done. This is using the
mclust package for R.
I have attached a test file. There are 11 columns: the first 10 columns
are various features (signals),
and the last column (11) is the best thing to do based on 20/20 hindsight,
i.e. a buy or sell indicator.
# Load mclust; findata is assumed to hold the attached test file, already read into R
> library(mclust)
# Create a matrix out of the data
> finMatrix <- as.matrix(findata[,1:10])
# Identify column 11 as the classes and do the classification with default values.
> finClass <- findata[,11]
> finMclust <- Mclust(finMatrix,maxG=2)
Here we trained on the entire data set, and finMclust$classification gives
the decision made by the classifier.
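
Since Mclust clusters without looking at the class column, the cluster numbers it returns are arbitrary and need to be lined up with the buy/sell labels before they can be compared. A minimal base-R sketch of that comparison follows; the two short vectors are hypothetical toy values standing in for finClass and finMclust$classification, not the attached data.

```r
# Hypothetical toy vectors standing in for finClass and finMclust$classification
actual  <- c("buy", "buy", "sell", "sell", "buy", "sell")
cluster <- c(1, 1, 2, 2, 2, 2)

# Cross-tabulate the known classes against the cluster labels
tab <- table(actual, cluster)
print(tab)

# Map each cluster to its most common class, then compute the error rate
majority  <- rownames(tab)[apply(tab, 2, which.max)]
predicted <- majority[match(cluster, as.numeric(colnames(tab)))]
errorRate <- mean(predicted != actual)
```

With the real data, the same table(finClass, finMclust$classification) call should show how well the two clusters line up with the buy/sell column.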
Now, if you want to train on a subset, say all the odd rows
[alternatively, one can cross-validate with ?sample or even bootstrap a training data set]:
> odd <- seq(from=1, to=nrow(findata), by=2)
> even <- seq(from=2, to=nrow(findata), by=2)
> round(cv1EMtrain(data = findata[odd,-11], labels = findata[odd,11]),3)
This will show that the VVI model would be selected based on the training data (all the odd rows).
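# M-step with the known labels: fit the VVI model parameters on the odd (training) rows
> vviModd <- mstepVVI(data=findata[odd,-11], z=unmap(findata[odd,11]))
# E-step on all rows: conditional class probabilities z under the fitted model
> vviZ <- do.call("estepVVI", c(vviModd, list(data=findata[,-11])))$z
# Classification error on the training (odd) rows
> classError(map(vviZ[odd,]), findata[odd,11])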
How do we do on the test data (the even rows)?
Hmmm. So a classification error of 4% on the test data... whew! (Maybe
I am missing something.)
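
As an alternative to the odd/even split, the ?sample route mentioned above could look roughly like this. It is only a sketch in base R: n is a stand-in for nrow(findata), and the seed and the 50/50 split size are arbitrary assumptions.

```r
set.seed(1)   # arbitrary seed so the split is reproducible
n <- 100      # stand-in for nrow(findata)

# Random hold-out split: half of the rows for training, the rest for testing
train <- sort(sample(n, size = n %/% 2))
test  <- setdiff(seq_len(n), train)

# Bootstrap training set: n rows drawn with replacement;
# rows that were never drawn ("out of bag") can serve as test rows
boot <- sample(n, size = n, replace = TRUE)
oob  <- setdiff(seq_len(n), boot)
```

The train and test index vectors could then take the place of odd and even in the calls above.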
paul sorenson wrote:
>I would be interested in the paper thanks. Unfortunately my level of
>expertise is not high in these matters.
>I may have just misunderstood yours and Krishna's response, the kind of
>paradigm I am thinking is:
> - User selects signals he/she wants to monitor.
> - When the user makes a buy/sell decision, the classifier then looks at
>the parameters of those signals and classifies the conditions for that
> - The user continues to train the classifier in this way, analogously
>to training a spam filter.
> - The classifier then can start emitting buy/sell signals based on the
>training. I.e. it is personalized to that user's previous choices.
>I only mentioned Bayesian methods because the most effective spam
>filtering I have used is apparently based on that method.