[R] Boosting, bagging and bumping. Questions about R tools and predictions.

Wed Jul 23 02:09:47 CEST 2003

I'd like to better understand the differences among methods that combine
many classification trees to improve classification accuracy. I'd also
like to find out which of them are available in R and which allow
prediction on new data. Can anybody point me to a citation or discussion?

Specifically, I want to classify remotely sensed imagery. The user
extracts training data for pixels of known class membership, usually
spectral bands plus categorical data (e.g., soil type). A tree is fit to
that training data (using rpart, for instance) and then applied to the
entire image. The result is a classified image that can be checked for
accuracy. Classification trees are increasingly used by the remote
sensing folks, but finding optimal trees seems to be an active area of
research in computational statistics.
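
To make the single-tree workflow concrete, here is a minimal sketch of
what I mean. The data are simulated stand-ins: the column names (band1,
band2, soil), the class labels, and the rule that generates them are all
made up for illustration, not taken from a real image.

```r
library(rpart)

set.seed(1)

# Hypothetical training pixels: two spectral bands plus a categorical soil type.
n <- 300
train <- data.frame(
  band1 = runif(n),
  band2 = runif(n),
  soil  = factor(sample(c("clay", "loam", "sand"), n, replace = TRUE))
)
# Made-up rule that generates the class labels, for illustration only.
train$class <- factor(ifelse(train$band1 + train$band2 > 1, "forest", "water"))

# Fit a single classification tree to the training pixels.
fit <- rpart(class ~ band1 + band2 + soil, data = train, method = "class")

# Hypothetical "image": a 20 x 20 grid stored one row per pixel,
# with the same column names and factor levels as the training data.
nr <- 20
nc <- 20
image_df <- data.frame(
  band1 = runif(nr * nc),
  band2 = runif(nr * nc),
  soil  = factor(sample(c("clay", "loam", "sand"), nr * nc, replace = TRUE),
                 levels = levels(train$soil))
)

# Apply the tree to every pixel, then reshape back into the image layout.
pred <- predict(fit, newdata = image_df, type = "class")
classified <- matrix(as.character(pred), nrow = nr, ncol = nc)
```

With real imagery the image_df step would be replaced by whatever reads
the bands into a data frame, one row per pixel; the predict() call is the
part I want to be able to repeat with many trees.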

I've seen great claims from baggers and boosters (and just what is
bumping?) about increased classification accuracy, but aside from TreeNet
by Salford Systems I'm not aware of tools that can grow forests of trees
and then use them to make predictions.
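
As far as I understand it, bagging amounts to growing each tree on a
bootstrap resample of the training data and predicting by majority vote.
A hand-rolled sketch with rpart might look like the following; the data
are simulated, and bag_predict is a hypothetical helper I made up here,
not a function from any package.

```r
library(rpart)

set.seed(42)

# Hypothetical training pixels (simulated, for illustration only).
n <- 300
train <- data.frame(band1 = runif(n), band2 = runif(n))
train$class <- factor(ifelse(train$band1 + train$band2 > 1, "forest", "water"))

# Grow B trees, each on a bootstrap resample of the training rows.
B <- 25
trees <- lapply(seq_len(B), function(b) {
  idx <- sample(nrow(train), replace = TRUE)
  rpart(class ~ band1 + band2, data = train[idx, ], method = "class")
})

# Hypothetical helper: predict by majority vote across the trees.
# Ties go to the class that sorts first alphabetically.
bag_predict <- function(trees, newdata) {
  votes <- sapply(trees, function(tr) {
    as.character(predict(tr, newdata = newdata, type = "class"))
  })
  # Force a pixels-by-trees matrix, even when newdata has a single row.
  votes <- matrix(votes, nrow = nrow(newdata))
  apply(votes, 1, function(v) names(which.max(table(v))))
}

# Classify some new (simulated) pixels with the bagged trees.
newpix <- data.frame(band1 = runif(50), band2 = runif(50))
bagged <- bag_predict(trees, newpix)
```

This works for a toy case, but I would rather use an existing, tested
implementation than maintain my own loop over trees.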

Can anybody help?
