[R] Robust Non-linear Regression

Martin Maechler maechler at stat.math.ethz.ch
Mon Nov 14 12:41:16 CET 2005


Package 'sfsmisc' has had a function  'rnls()' for a while 
which does robust non-linear regression via M-estimation.

[The name of the function is probably *really* a misnomer,
 since the 'ls' part stands for "least squares"!]

Two weeks ago, there's been a small workshop
"Robustness and R" in Treviso (It),
http://www.dst.unive.it/rsr/

where we've talked about available and missing robustness
functionality ``in R''.  One consequence of the workshop is the
new mailing list "R-SIG-Robust" {to which I CC this message} 
and another planned and hopefully even more consequential
consequence will be collaboration on producing more widely
available robustness functionality for R.  Do subscribe to the
list if you are interested.

>>>>> "Vermeiren" == Vermeiren, Hans [VRCBE] <Vermeiren>
>>>>>     on Sun, 13 Nov 2005 22:47:31 +0100 writes:

    Vermeiren> Hi, I'm trying to use Robust non-linear
    Vermeiren> regression to fit dose response curves.  Maybe I
    Vermeiren> didnt look good enough, but I dind't find robust
    Vermeiren> methods for NON linear regression implemented in
    Vermeiren> R. A method that looked good to me but is
    Vermeiren> unfortunately not (yet) implemented in R is
    Vermeiren> described in
    Vermeiren> http://www.graphpad.com/articles/RobustNonlinearRegression_files/frame.htm

 
    Vermeiren> in short: instead of using the premise that the
    Vermeiren> residuals are gaussian they propose a Lorentzian
    Vermeiren> distribution, in stead of minimizing the squared
    Vermeiren> residus SUM (Y-Yhat)^2, the objective function is
    Vermeiren> now SUM log(1+(Y-Yhat)^2/ RobustSD)
 
    Vermeiren> where RobustSD is the 68th percentile of the
    Vermeiren> absolute value of the residues

 
    Vermeiren> my question is: is there a smart and elegant way
    Vermeiren> to change to objective function from squared
    Vermeiren> Distance to log(1+D^2/Rsd^2) ?

no; not easily.
 
    Vermeiren> or alternatively to write this as a weighted
    Vermeiren> non-linear regression where the weights are
    Vermeiren> recalculated during the iterations in nlme it is
    Vermeiren> possible to specify weights, possibly that is the
    Vermeiren> way to do it, but I didn't manage to get it
    Vermeiren> working the weights should then be something
    Vermeiren> like:
 
    Vermeiren> SUM (log(1+(resid(.)/quantile(all_residuals,0.68))^2))
    Vermeiren>   / SUM (resid(.))
 
rnls() mentioned does use robust weights and IRLS (iteratively
reweighted LS) making use of  nls() and rlm(),
similarly to your suggestion.

    Vermeiren> the test data I use :

    Vermeiren> x<-seq(-5,-2,length=50)
    Vermeiren> x<-rep(x,4)
    Vermeiren> y<-SSfpl(x,0,100,-3.5,1)
    Vermeiren> y<-y+rnorm(length(y),sd=5)
    Vermeiren> y[sample(1:length(y),floor(length(y)/50))]<-200 # add 2% outliers at 200

Since you have only outliers in 'y' and none in 'x',
you could use the 'nlrq' (nonlinear regression quantiles)
package that Roger Koenker mentioned.

To really robustify such self starting models as the 4-parameter
logistic 'SSfpl' above, you would also need to provide a robust
initial estimator; 
maybe that could be done pretty easily 'rlm()' instead of 'lm()' and
using 'rnls()' instead of 'nls()' also for the "initial" part in
something like

SSfpl.rob <-
selfStart(~ A + (B - A)/(1 + exp((xmid - input)/scal)), 
          initial = function( ...) { ...... },
          parameters= c("A","B","xmid","scal"))
 
{look at 'SSfpl() for the initial estimator}.

However, BTW, currently the "plinear" version fails for our robust
nonlinear procedure 'rnls()'.

Martin Maechler, ETH Zurich




More information about the R-help mailing list