[R] Regressions with monotonicity constraints
Thomas Lumley
tlumley at u.washington.edu
Tue Mar 13 00:19:22 CET 2001
On Mon, 12 Mar 2001, Vadim Ogranovich wrote:
> This seems to be a recurrent topic, but I don't remember hearing a
> definitive answer. I also apologies for cross-posting.
>
> Say I have a numerical response variable and a bunch of multi-level factors
> I want to use for modeling. I don't expect factor interaction to be
> important so there will be no interactions in the model.
> All this would be a perfect job for ANOVA except for one additional
> requirement: all factors are ordered, i.e. for any two levels in a factor
> one is bigger than the other, and I want the regression to preserve the
> ordering, that is to be monotonic.
>
> I can think of two definitions of monotonicity, weak and strong, and in case
> it matters I am more interested in the strong one, which I define as
> follows:
> If L1 < L2 within some factor F than
> prediction(L1, other factors) < prediction(L2, other factors)
> for ANY SELECTION of 'other factors' levels.
> (A weak monotonicity could be defined as prediction(L1) < prediction(L2) on
> average).
in the absence of interactions these are equivalent, surely?
One-dimensional monotonic regression is easy using the
Pool-Adjacent-Violators algorithm. I think an additive monotonic model
could be estimated by backfitting as for generalised linear models: update
the function for each predictor in turn by using the one-dimensional
algorithm on the partial residuals from the previous iteration.
Backfitting is described in "Generalized Additive Models" by Hastie &
Tibshirani (Chapman & Hall) and probably other places.
The more difficult case is a non-additive model (ie interactions). For
discrete factors there were two papers in the Journal of Computational &
Graphical Statistics by Qian and colleagues. One of them was
@article{qian:eddy:1996,
Author = {Qian, Shixian and Eddy, William F.},
Title = {An Algorithm for Isotonic Regression on Ordered Rectangular
Grids},
Year = 1996,
Journal = {Journal of Computational and Graphical Statistics},
Volume = 5,
Pages = {225--235},
Keywords = {[Block class algorithm]; [Monotone regression]; [Recursive
partitioning]; [Sandwich algorithm]}
}
> Any reference to S, R, C code / papers which address this topic will be
> highly appreciated.
The following does one-dimensional isotonic regression. I believe similar
code has been posted before. It is recursive but seems fast enough for
many uses. It returns a compressed form of the result where the first
component has the distinct fitted values and the second indicates the
number of repeats.
-thomas
Thomas Lumley Asst. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle
pava.blocks<-function(x,w=rep(1,length(x)),b=rep(1,length(x)),up=T){
lasti<-1
if (length(x)==1) return(list(x=x,blocks=b,increasing=up))
for (i in 2:length(x)){
if (x[i]<=x[lasti] & up){
wtotal<-w[lasti]+w[i]
x[lasti]<-(x[lasti]*w[lasti]+x[i]*w[i])/wtotal
w[lasti]<-wtotal
b[lasti]<-b[i]+b[lasti]
b[i]<-0
}else if (x[i]<=x[lasti] & !up){
lasti<-i
}else if (x[i]>x[lasti] & !up){
wtotal<-w[lasti]+w[i]
x[lasti]<-(x[lasti]*w[lasti]+x[i]*w[i])/wtotal
w[lasti]<-wtotal
b[lasti]<-b[i]+b[lasti]
b[i]<-0
}else if(x[i]>x[lasti] & up){
lasti<-i
}
}
if (any(b==0)) {
subset<-(b>0)
return(pava.blocks(x[subset],w[subset],b[subset],up))
}
else return(list(x=x,blocks=b,increasing=up))
}
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
More information about the R-help
mailing list