[R] mboost: Proportional odds boosting model - how to specify the offset?

Benjamin Hofner benjamin.hofner at fau.de
Fri Mar 20 16:06:17 CET 2015


Dear Madlene,

the problem that you observed was twofold.

First, mboost expects the offset to be a scalar or a vector with length
equal to the number of observations. However, fitted(p.iris) is a
matrix of fitted class probabilities. In PropOdds(), the linear or
additive predictor is shared among all outcome categories and the
thresholds are treated as nuisance parameters. What you need to supply
as offset is the value of the linear or additive predictor (i.e.,
x'beta) for each observation instead of the fitted class probabilities.
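
A minimal sketch of the shape mismatch described above, using only MASS.
It assumes that the "lp" component of a polr fit (its stored linear
predictor) is the quantity of interest; whether the sign and threshold
conventions of polr() line up exactly with those of PropOdds() is not
checked here.

```r
library(MASS)

data(iris)
iris$Species <- factor(iris$Species, ordered = TRUE)
p.iris <- polr(Species ~ Sepal.Length, data = iris)

## fitted() returns class probabilities: one row per observation,
## one column per outcome category, so it is not a valid offset
dim(fitted(p.iris))

## the linear predictor x'beta is stored in the "lp" component
## of the polr fit: one value per observation
lp <- p.iris$lp
length(lp)
```

The first call reports a 150 x 3 matrix, the second a vector of length
150, which is the shape mboost expects for an offset.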

Second, there was a bug in mboost. I fixed it on R-forge [1]. Once the
package has been built successfully there, install it with
   install.packages("mboost", repos="http://R-Forge.R-project.org")
You can also email me off-list, and I will send you the package sources
directly.

Your nuisance parameters (which represent the class thresholds) can be
extracted via nuisance(mlp). More details are given in the example below.

Best,
Benjamin

[1] http://r-forge.r-project.org/projects/mboost/

---- Example code ----

library(MASS)
library(mboost)

data(iris)
iris$Species <- factor(iris$Species, ordered = TRUE)
p.iris <- polr(Species  ~ Sepal.Length, data = iris)
p.iris

lm.iris <- glmboost(Species  ~ Sepal.Length, data = iris,
                     family = PropOdds(nuirange = c(-0.5, 3)))
## set the number of boosting iterations (mstop) to 1000
lm.iris[1000]
## thresholds:
nuisance(lm.iris)

## to make these comparable to p.iris use
nuisance(lm.iris) - coef(lm.iris)["(Intercept)"] -
     attr(coef(lm.iris), "offset")

## now use the linear predictor of lm.iris as offset
## (fitted(lm.iris) has one value per observation):
mlp <- gamboost(Species ~ bols(Sepal.Length) + bols(Sepal.Width),
                 data = iris, family = PropOdds(nuirange = c(0, 1)),
                 offset = fitted(lm.iris))
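
As a quick sanity check on the example above, the following sketch
repeats the same calls in self-contained form. It assumes an mboost
version that already contains the fix, leaves mstop at its default to
keep the run short, and wraps the offset in as.numeric() purely as a
precaution. It confirms that the offset has one value per observation
and then extracts the thresholds of the refitted model.

```r
library(mboost)

data(iris)
iris$Species <- factor(iris$Species, ordered = TRUE)

## base model, default number of boosting iterations
lm.iris <- glmboost(Species ~ Sepal.Length, data = iris,
                    family = PropOdds(nuirange = c(-0.5, 3)))

## linear predictor of the base model; as.numeric() is a precaution
## (an assumption) in case fitted() returns a one-column matrix
off <- as.numeric(fitted(lm.iris))
length(off) == nrow(iris)

## refit with the linear predictor as offset
mlp <- gamboost(Species ~ bols(Sepal.Length) + bols(Sepal.Width),
                data = iris, family = PropOdds(nuirange = c(0, 1)),
                offset = off)

## thresholds (nuisance parameters) of the refitted model
nuisance(mlp)
```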




Nussbaum  Madlene wrote
> Dear R team
>
> The package mboost allows for boosting of proportional odds models.
> However, I would like to include an offset for every observation.
> This produces an error - no matter how I put the offset (as response
> probabilities or as response link).
>
> Fitting gamboost-models with offset works satisfactory with family =
> Gaussian() or Multinomial().
>
> Questions: 1) How do I need to specify the offset with family =
> PropOdds()?
>
> 2) Where in the mboost-object do I find the Theta's (response
> category dependent intercept)?
>
>
>
> # --- minimal example with iris data ---
>
> library(MASS)
> library(mboost)
>
> data(iris)
> iris$Species <- factor(iris$Species, ordered = T)
> p.iris <- polr(Species  ~ Sepal.Length, data = iris)
> mlp <- gamboost(Species ~ bols(Sepal.Length) + bols(Sepal.Width),
>                data = iris, family = PropOdds(),
>                offset = fitted(p.iris) )
>
> Error in tmp[[i]] : subscript out of bounds
>
>
> Thank you
> M. Nussbaum
>
> --
>
> ETH Zürich
> Madlene Nussbaum
> Institut für Terrestrische Ökosysteme
> Boden- und Terrestrische Umweltphysik
> CHN E 37.2
> Universitätstrasse 16
> CH-8092 Zürich
>
> Telefon + 44 632 73 21
> Mobile  + 79 761 34 66
> madlene.nussbaum at .ethz
> www.step.ethz.ch


