BagBoosting for Tumor Classification with Gene Expression Data
Marcel Dettling
March, 2004
Abstract
Motivation: Microarray experiments are expected to contribute
significantly to progress in cancer treatment by enabling a precise and
early diagnosis. They create a need for class prediction tools that can
deal with a large number of highly correlated input variables, perform
feature selection, and provide class probability estimates that serve as a
quantification of the predictive uncertainty. A very promising solution is
to combine the two ensemble schemes, bagging and boosting, into a novel
algorithm called BagBoosting.
Results: When bagging is used as a module in boosting, the
resulting classifier consistently improves on the predictive performance
and the probability estimates of both bagging and boosting on real and
simulated gene expression data. This quasi-guaranteed improvement can be
obtained simply by investing more computing effort. The empirical
advantage is also clearly present when BagBoosting is compared with several
established class prediction tools for microarray data.
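To make the idea of "bagging as a module in boosting" concrete, the
following is a minimal sketch, not the software described in this report.
It assumes squared-error boosting of regression stumps on labels coded as
-1/+1, where the single stump normally fitted in each boosting iteration is
replaced by an average of stumps fitted to bootstrap samples. The loss, the
base learner, the step size and all function names (fit_stump,
fit_bagged_stumps, bagboost_fit, bagboost_predict) are assumptions made for
illustration; the abstract does not specify them.

```python
import random

def fit_stump(X, r):
    """Least-squares regression stump: (feature, threshold, left_mean, right_mean)."""
    n, p = len(X), len(X[0])
    best = None
    for j in range(p):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [r[i] for i in range(n) if X[i][j] <= t]
            right = [r[i] for i in range(n) if X[i][j] > t]
            if not left or not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, ml, mr)
    if best is None:  # no valid split (e.g. all rows identical): predict the mean
        m = sum(r) / n
        return (0, float("inf"), m, m)
    return best[1:]

def predict_stump(stump, x):
    j, t, ml, mr = stump
    return ml if x[j] <= t else mr

def fit_bagged_stumps(X, r, n_bags, rng):
    """Bagging step: fit one stump per bootstrap sample of (X, r)."""
    n = len(X)
    stumps = []
    for _ in range(n_bags):
        idx = [rng.randrange(n) for _ in range(n)]  # draw n rows with replacement
        stumps.append(fit_stump([X[i] for i in idx], [r[i] for i in idx]))
    return stumps

def predict_bagged(stumps, x):
    return sum(predict_stump(s, x) for s in stumps) / len(stumps)

def bagboost_fit(X, y, n_iter=20, n_bags=10, step=0.5, seed=0):
    """Boosting loop whose base learner is a bagged ensemble of stumps."""
    rng = random.Random(seed)
    F = [0.0] * len(X)  # current additive fit on the training points
    ensemble = []
    for _ in range(n_iter):
        residuals = [yi - fi for yi, fi in zip(y, F)]  # squared-error working response
        bag = fit_bagged_stumps(X, residuals, n_bags, rng)
        ensemble.append(bag)
        F = [fi + step * predict_bagged(bag, x) for fi, x in zip(F, X)]
    return ensemble, step

def bagboost_predict(model, x):
    """Return the predicted class label, -1 or +1."""
    ensemble, step = model
    score = step * sum(predict_bagged(bag, x) for bag in ensemble)
    return 1 if score >= 0 else -1

# Toy data (made up): the first feature separates the classes, the second is noise.
X = [[0.0, 2.1], [0.1, 0.4], [0.2, 1.7], [0.3, 0.9],
     [0.7, 1.2], [0.8, 2.4], [0.9, 0.6], [1.0, 1.5]]
y = [-1, -1, -1, -1, 1, 1, 1, 1]
model = bagboost_fit(X, y)
predictions = [bagboost_predict(model, x) for x in X]
```

In this sketch the number of stumps fitted grows as n_iter times n_bags,
which is one way to read the remark that the improvement is bought with
additional computing effort.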
Availability: Software for the modified boosting
algorithms, for all other classifiers described in this paper, as well as
for benchmark studies and simulation of microarray data, is available for
free as an R package.
Research Reports, Seminar für Statistik.