[R] mboost: Proportional odds boosting model - how to specify the offset?
Benjamin Hofner
benjamin.hofner at fau.de
Fri Mar 20 16:06:17 CET 2015
Dear Madlene,
the problem that you observed was twofold.
First, mboost expects the offset to be a scalar or a vector with length
equal to the number of observations. However, fitted(p.iris) is a
matrix of fitted class probabilities (one column per category). In
PropOdds(), the linear or additive predictor is shared among all
outcome categories and the thresholds are treated as nuisance
parameters. What you need to supply as offset is therefore the value of
the linear or additive predictor (i.e., x'beta), one number per
observation, and not the fitted class probabilities.
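To make the shape mismatch concrete, here is a minimal sketch that uses
only MASS. It relies on the lp component of a polr fit, which holds
polr's linear predictor. Whether that vector is the offset you actually
want for your model is an assumption you should check.

```r
library(MASS)

data(iris)
iris$Species <- factor(iris$Species, ordered = TRUE)
p.iris <- polr(Species ~ Sepal.Length, data = iris)

## fitted() returns class probabilities: one row per observation,
## one column per outcome category
prob.dim <- dim(fitted(p.iris))
prob.dim

## the linear predictor x'beta is a plain vector with one value
## per observation, which is the shape mboost expects for an offset
lp.len <- length(p.iris$lp)
lp.len
```

The first result has two entries (rows and columns), the second is a
single length equal to nrow(iris).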
Second, there was a bug in mboost, which I have fixed on R-forge [1].
Once the package has been built successfully there, use
install.packages("mboost", repos="http://R-Forge.R-project.org")
to install it. You can also email me off-list, and I will send you
the package sources directly.
Your nuisance parameters (which represent the class thresholds) can be
extracted via nuisance(mlp). More details are given in the example below.
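For reference, this is how the thresholds enter the model. I assume the
usual parameterization of the proportional odds model here; the
notation (theta_k for the thresholds, f(x) for the shared predictor, K
for the number of categories) is mine and not taken from the mboost
sources.

```latex
P(Y \le k \mid x) = \frac{1}{1 + \exp\{-(\theta_k - f(x))\}},
\qquad k = 1, \dots, K - 1 .
```

In a glmboost fit the predictor is f(x) = offset + intercept + x'beta,
so theta_k - f(x) = (theta_k - offset - intercept) - x'beta. polr has
no intercept, which is why the thresholds have to be shifted by the
intercept and the offset before they can be compared with the polr
estimates.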
Best,
Benjamin
[1] http://r-forge.r-project.org/projects/mboost/
---- Example code ----
library(MASS)
library(mboost)
data(iris)
iris$Species <- factor(iris$Species, ordered = TRUE)
## reference fit with polr
p.iris <- polr(Species ~ Sepal.Length, data = iris)
p.iris
## linear proportional odds model fitted by boosting
lm.iris <- glmboost(Species ~ Sepal.Length, data = iris,
                    family = PropOdds(nuirange = c(-0.5, 3)))
## set the number of boosting iterations to 1000 (alters lm.iris in place)
lm.iris[1000]
## thresholds:
nuisance(lm.iris)
## to make these comparable to p.iris use
nuisance(lm.iris) - coef(lm.iris)["(Intercept)"] -
    attr(coef(lm.iris), "offset")
## now use linear predictor as offset:
mlp <- gamboost(Species ~ bols(Sepal.Length) + bols(Sepal.Width),
                data = iris, family = PropOdds(nuirange = c(0, 1)),
                offset = fitted(lm.iris))
Nussbaum Madlene wrote
> Dear R team
>
> The package mboost allows for boosting of proportional odds models.
> However, I would like to include an offset for every observation.
> This produces an error - no matter how I put the offset (as response
> probabilities or as response link).
>
> Fitting gamboost-models with offset works satisfactory with family =
> Gaussian() or Multinomial().
>
> Questions: 1) How do I need to specify the offset with family =
> PropOdds()?
>
> 2) Where in the mboost-object do I find the Theta's (response
> category dependent intercept)?
>
>
>
> # --- minimal example with iris data ---
>
> library(MASS)
> library(mboost)
>
> data(iris)
> iris$Species <- factor(iris$Species, ordered = T)
> p.iris <- polr(Species ~ Sepal.Length, data = iris)
> mlp <- gamboost(Species ~ bols(Sepal.Length) + bols(Sepal.Width),
> data = iris, family = PropOdds(),
> offset = fitted(p.iris) )
>
> Error in tmp[[i]] : subscript out of bounds
>
>
> Thank you
> M. Nussbaum
>
> --
>
> ETH Zürich
> Madlene Nussbaum
> Institut für Terrestrische Ökosysteme
> Boden- und Terrestrische Umweltphysik
> CHN E 37.2
> Universitätstrasse 16
> CH-8092 Zürich
>
> Telefon + 44 632 73 21
> Mobile + 79 761 34 66
> madlene.nussbaum at .ethz
> www.step.ethz.ch