[R] mboost: how to implement cost-sensitive boosting family

Tang Yuchun tyczjs at yahoo.com
Wed Feb 3 17:38:36 CET 2010


mboost provides the blackboost function for building tree-based boosting models. I tried to write my own "cost-sensitive" ada family, but my implementation of the ngradient, loss, and offset functions is evidently not right. I would greatly appreciate it if anyone could help me out or show me how to write a cost-sensitive family. Thanks!

Here are some families I wrote:

ngradient <- function (y, f, w = 1) 
{
  y * ifelse(y==1,10,1) * exp(-y * f * ifelse(y==1,10,1))
}

loss <- function (y, f) 
{
  exp(-y * f * ifelse(y==1,10,1))
}

offset <- function (y, w) 
{
  p <- weighted.mean(y > 0, w)
  # the denominator needs parentheses: without them R evaluates 10*p/1*(1 - p) as 10*p*(1 - p)
  1/(10+1) * log(10*p/(1*(1 - p)))
}

CSAdaExp <- Family(ngradient = ngradient, loss = loss, offset = offset);

model.blackboost <- blackboost(tr[,1:DIM], tr.y, family=CSAdaExp,
  weights=tr.w,
  control=boost_control(mstop=100, nu=0.1),
  tree_controls=ctree_control(teststat = "max", testtype = "Teststatistic",
    mincriterion = 0, maxdepth = 10))
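
One way to sanity-check a family like this before handing it to blackboost is to compare ngradient against a finite difference of loss, and the closed-form offset against a numerical minimiser of the empirical risk. The following is only a sketch in base R under stated assumptions: y is coded as -1/+1, the costs are 10 and 1 as above, the toy data and the helper cost() are made up for illustration, and the offset is written with the denominator (1 - p) in parentheses.

```r
# Cost-sensitive exponential loss: cost 10 for y == 1, cost 1 for y == -1.
# cost() is a hypothetical helper, not part of mboost.
cost <- function(y) ifelse(y == 1, 10, 1)
loss <- function(y, f) exp(-y * f * cost(y))
ngradient <- function(y, f, w = 1) y * cost(y) * exp(-y * f * cost(y))

# Hypothetical toy data, for illustration only.
y <- c(1, -1, 1, -1, -1)
f <- c(0.05, -0.20, -0.10, 0.30, 0.00)
w <- rep(1, length(y))

# Central finite difference of the loss with respect to f.
# ngradient should equal minus this derivative, so the sum should be near 0.
eps <- 1e-6
num_grad <- (loss(y, f + eps) - loss(y, f - eps)) / (2 * eps)
max_grad_err <- max(abs(ngradient(y, f) + num_grad))

# Offset: the constant f0 that minimises the weighted empirical risk.
p <- weighted.mean(y > 0, w)
offset_closed <- 1 / (10 + 1) * log(10 * p / (1 * (1 - p)))
risk <- function(f0) sum(w * loss(y, rep(f0, length(y))))
offset_numeric <- optimize(risk, interval = c(-2, 2))$minimum

print(c(max_grad_err = max_grad_err,
        offset_closed = offset_closed,
        offset_numeric = offset_numeric))
```

If max_grad_err is not close to zero, or the two offsets disagree, the family is inconsistent before any boosting happens, which is cheaper to find out here than from a fitted model.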

or 

#loss <- function (y, f) 
#{
#  exp(-y * f * ifelse(y==1,COST_FN,COST_FP))
#}

#ngradient <- function (y, f, w = 1) 
#{
#  y * ifelse(y==1,COST_FN,COST_FP) * exp(-y * f * ifelse(y==1,COST_FN,COST_FP))
#}

#offset <- function (y, w) 
#{
#  p <- weighted.mean(y > 0, w)
#  1/(COST_FN+COST_FP) * log(COST_FN*p/(COST_FP*(1 - p)))
#}

loss <- function (y, f) 
{
  ifelse(y==1, 1/(1+exp(0.001*y*f)), log(1+exp(-y*f)) )
}

ngradient <- function (y, f, w = 1) 
{
  # negative gradient of the loss above w.r.t. f; the logistic branch needs the factor y
  ifelse(y==1, 0.001*exp(0.001*y*f)/((1+exp(0.001*y*f))^2), y*exp(-y*f)/(1+exp(-y*f)) )
}

CSAdaExp <- Family(ngradient = ngradient, loss = loss);

model.blackboost <- blackboost(tr[,1:DIM], tr.y, family=CSAdaExp,
  weights=NULL,
  control=boost_control(mstop=MSTOP, nu=0.1, savedata=TRUE,
    save_ensembless=TRUE, trace=TRUE),
  tree_controls=ctree_control(teststat = "max", testtype = "Teststatistic",
    mincriterion = 0, minsplit = 2000, minbucket = 700,
    maxdepth = TREEDEPTH))
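
The same kind of check can be run on the asymmetric loss. This is a sketch in base R, assuming y is coded as -1/+1; the toy data are made up, and the negative gradient is derived by hand with the chain rule rather than taken from any mboost family. Note that for y == -1 the negative gradient of log(1 + exp(-y*f)) comes out negative.

```r
# Asymmetric loss: a flat sigmoid-type loss for y == 1,
# the logistic loss log(1 + exp(-y * f)) for y == -1.
loss <- function(y, f) {
  ifelse(y == 1, 1 / (1 + exp(0.001 * y * f)), log(1 + exp(-y * f)))
}

# Negative gradient -dL/df, derived by hand for each branch.
ngradient <- function(y, f, w = 1) {
  ifelse(y == 1,
         0.001 * y * exp(0.001 * y * f) / (1 + exp(0.001 * y * f))^2,
         y * exp(-y * f) / (1 + exp(-y * f)))
}

# Hypothetical toy data, for illustration only.
y <- c(1, -1, 1, -1, -1)
f <- c(0.5, -0.2, -1.0, 0.3, 0.0)

# Central finite difference of the loss with respect to f.
eps <- 1e-6
num_grad <- (loss(y, f + eps) - loss(y, f - eps)) / (2 * eps)
max_err <- max(abs(ngradient(y, f) + num_grad))
neg_class_grad <- ngradient(y[y == -1], f[y == -1])

print(max_err)
print(neg_class_grad)
```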


 --------------------------------
regards,
Yuchun Tang, Ph.D.

Principal Engineer, Lead
McAfee, Inc.
4501 North Point Parkway
Suite 300
Alpharetta, GA  30022

Main: 770.776.2685

www.mcafee.com
www.trustedsource.org
www.linkedin.com/in/yuchuntang


