[R] XGBoost continuous outcome case --- reg:linear in R
Sandeep Rana
sunnysingha.analytics at gmail.com
Tue Feb 9 07:18:23 CET 2016
Hi,
While learning how to use XGBoost in R, I came across the case below and would like to know how to handle it.
Outcome variable: continuous
Independent features: mix of categorical and continuous
nrow(train_set): 8523
Since XGBoost natively supports only numeric features, I applied one-hot encoding to the training data set:
target <- train_set$Outlet_sales
sparsed_train_set <- sparse.model.matrix(~.-1, data=train_set)
nrow(sparsed_train_set): 4526  # As expected, the row count is reduced.
Note: The target variable is continuous and has as many entries as train_set has rows, i.e. 8523, because it was extracted before one-hot encoding was applied.
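The mismatch can be reproduced on a small example. The sketch below assumes the rows disappear because of missing values: sparse.model.matrix builds a model frame, and the default na.action (na.omit) drops every row that contains an NA. The data frame toy_train and its columns are made up for illustration and are not the original data. Filtering the rows first and taking the label from the same filtered rows keeps the two aligned, and putting the outcome on the left-hand side of the formula keeps it out of the feature matrix.

```r
library(Matrix)

# Hypothetical stand-in for train_set; two rows contain an NA.
toy_train <- data.frame(
  Outlet_sales = c(10.5, 20.1, 15.3, 8.7, 30.2),
  Item_weight  = c(1.2, NA, 3.4, 2.2, NA),
  Outlet_type  = factor(c("a", "b", "a", "c", "b"))
)

# Same call shape as in the post: under the default na.action, rows with
# an NA are dropped, so the matrix has fewer rows than the data frame.
naive <- sparse.model.matrix(~ . - 1, data = toy_train)
c(data_rows = nrow(toy_train), matrix_rows = nrow(naive))

# Keep only complete rows, then derive both the label and the features
# from that same subset so they stay aligned.
keep          <- complete.cases(toy_train)
complete_rows <- toy_train[keep, ]
target        <- complete_rows$Outlet_sales
features      <- sparse.model.matrix(Outlet_sales ~ . - 1, data = complete_rows)

# The label and the feature matrix must agree before any model is fit.
stopifnot(nrow(features) == length(target))
```

With the real data, the same check, nrow(features) == length(target), should hold before the matrix is passed to xgboost.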
# To build the model:
bst <- xgboost(data = sparsed_train_set, label = target, max.depth = 4,
               eta = 1, nthread = 4, nrounds = 50, objective = "reg:linear")
# The call above fails because of the row-count mismatch between the data and the label.
My questions:
- How should I handle the above disparity between the sparse training data and the label when building the model?
- How should I use XGBoost to perform regression where the outcome is continuous? Most web resources cover only classification cases.
If anyone could point me to a source explaining this, I would appreciate it. I have gone through the documentation, but it did not make this case clear to me.
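A minimal regression sketch on a continuous label, under a few assumptions: the xgboost package is installed, the data (toy, made up here) has no missing values, and the installed release accepts the objective name "reg:squarederror", which newer releases use for the squared-error objective that this post calls "reg:linear". The parameter values are placeholders, not recommendations.

```r
library(Matrix)

set.seed(1)
n <- 200

# Hypothetical data: one numeric feature, one categorical feature,
# and a continuous outcome y built from both plus a little noise.
toy <- data.frame(
  x1 = rnorm(n),
  g  = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)
toy$y <- 2 * toy$x1 + as.integer(toy$g) + rnorm(n, sd = 0.1)

# One-hot encode the features; y on the left-hand side is excluded.
X <- sparse.model.matrix(y ~ . - 1, data = toy)
stopifnot(nrow(X) == length(toy$y))

# Fit only if xgboost is available, so the encoding part runs on its own.
if (requireNamespace("xgboost", quietly = TRUE)) {
  dtrain <- xgboost::xgb.DMatrix(data = X, label = toy$y)
  bst <- xgboost::xgb.train(
    params = list(
      objective = "reg:squarederror",  # assumed name; older releases: "reg:linear"
      max_depth = 4,
      eta       = 0.1,
      nthread   = 1
    ),
    data    = dtrain,
    nrounds = 50
  )
  # For a regression objective, predictions are numeric values, one per row.
  pred <- predict(bst, dtrain)
}
```

The only regression-specific pieces are the numeric label and the objective; the one-hot encoding step is the same as it would be for classification.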
Regards,
Sandeep S. Rana