[RsR] Prediction Intervals for Robust Regression

Stromberg, Arnold astro11 using uky.edu
Mon Feb 23 08:19:58 CET 2015


Seems to me that the user would have to decide on the issue of "real" vs "normal" observations for the prediction interval.

Arnold J. Stromberg
Professor and Chair
Department of Statistics
University of Kentucky
313 Multidisciplinary Science Building
725 Rose Street
Lexington, KY 40536-0082
Phone: 859-257-6115
Fax: 859-323-1973

-----Original Message-----
From: Stahel Werner A. [mailto:stahel using stat.math.ethz.ch] 
Sent: Saturday, February 21, 2015 11:51 AM
To: Mächler Martin; Jonathan Burns; Stromberg, Arnold
Cc: mailman, r-sig-robust
Subject: Re: [RsR] Prediction Intervals for Robust Regression

Hello Jonathan

Even though it is straightforward, there is a twist to it: Robust methods are for data with outliers or -- a more sophisticated view -- long-tailed distributions. The normal quantiles might be used to encompass "normal" observations -- but the probability of having a real observation in the interval would be overestimated.
An alternative may be to use an empirical quantile of the standardized residuals with scale equal to the standard deviation obtained from the formula for normal observations. 
I wonder whether this argument has been formally written down in the literature.
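
One possible reading of this suggestion, as a rough sketch only: standardize the residuals by the prediction standard deviation from the usual normal-theory formula, then replace the normal (or t) quantiles by empirical quantiles of those standardized residuals. The sketch assumes the 'robustbase' package is installed and uses the 'stackloss' data that ships with R; the names `pred_sd`, `q`, `new_obs` and `pi_emp` are illustrative, and this is not an established procedure from the literature.

```r
library(robustbase)   # assumption: 'robustbase' is installed

set.seed(1)           # lmrob() uses random subsampling for its initial estimate
fit <- lmrob(stack.loss ~ ., data = stackloss)

# Prediction standard deviation from the formula for normal observations:
# sqrt(se(fitted value)^2 + residual scale^2)
p       <- predict(fit, se.fit = TRUE)
sigma   <- fit$scale                      # robust residual scale estimate
pred_sd <- sqrt(p$se.fit^2 + sigma^2)

# Empirical quantiles of the standardized residuals
z <- residuals(fit) / pred_sd
q <- unname(quantile(z, probs = c(0.025, 0.975)))

# Intervals for (hypothetical) new observations; rows are reused for illustration
new_obs <- stackloss[1:3, c("Air.Flow", "Water.Temp", "Acid.Conc.")]
pn      <- predict(fit, newdata = new_obs, se.fit = TRUE)
sd_new  <- sqrt(pn$se.fit^2 + sigma^2)

pi_emp <- cbind(fit = pn$fit,
                lwr = pn$fit + q[1] * sd_new,
                upr = pn$fit + q[2] * sd_new)
pi_emp
```

With only 21 observations the empirical 2.5% and 97.5% quantiles are very crude, so this is meant to show the mechanics rather than to recommend the approach for small samples.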

Good success!

Werner Stahel
M +41 79 784 9330 | P +41 44 364 6424

________________________________________
From: R-SIG-Robust [r-sig-robust-bounces using r-project.org] on behalf of Martin Maechler [maechler using lynne.stat.math.ethz.ch]
Sent: Friday, 20 February 2015 17:10
To: Jonathan Burns; Stromberg, Arnold
Cc: mailman, r-sig-robust
Subject: Re: [RsR] Prediction Intervals for Robust Regression

>>>>> Stromberg, Arnold <astro11 using uky.edu>
>>>>>     on Mon, 16 Feb 2015 20:14:06 +0000 writes:

    > Jonathan,
    > Seems straightforward theoretically, let's see if anyone has implemented them in R.

    > Arny

    > Arnold J. Stromberg
    > Professor and Chair, Department of Statistics, University of Kentucky
    [ ........... ]

Well, the predict() method for lmrob() fits (package
'robustbase') has them built in.
I wonder why nobody has seen that and mentioned it here.

In the meantime, Jonathan has also asked on R-help and got some advice there...
and has now found predict.lmrob "in some way" and asked me (as 'robustbase' maintainer) about it.

I'm taking the liberty of answering here -- so others are also helped in the future, *and* this thread is somewhat decently closed within the R-SIG-robust list :

>>>>> Burns, Jonathan (NONUS) <Jonathan.Burns1 using GDIT.com>
>>>>>     on Thu, 19 Feb 2015 21:37:26 +0000 writes:

   [..........]

    > I am interested in creating prediction intervals for the robust regression models.  I tried to use the function predict.lmrob(); however, R gave me an error - could not find function "predict.lmrob".  I thought perhaps this was because I was using an older version of the package.  I updated the package and I still get the error.  I am using R version 3.1.0.

    > I also got the same error with the functions print.lmrob(), plot.lmrob() and anova.lmrob().  lmrob() itself works fine.

    > This is the result that I get when I list the functions in robustbase:

    >> ls("package:robustbase")
    > [1] "adjbox"              "adjboxStats"         "adjOutlyingness"
    > [4] "aircraft"            "airmay"              "alcohol"
    ......................
    ......................
    > [103] "vaso"                "wagnerGrowth"        "wgt.himedian"
    > [106] "wood"

print(), predict(), etc. are all generic functions.
Their S3 methods for class "lmrob" *are* called print.lmrob(), predict.lmrob(), etc.,
but they are *hidden* (not exported from the package namespace), so you do not normally see them.

Rather, you should call print(..), predict(...), etc.

If you really need to see them you can use
  getAnywhere("predict.lmrob")
etc.

{This is all general R knowledge - somewhat intermediate level -  about using S3 methods in R packages and namespaces}
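
As a minimal sketch of the lookup described above (assuming the 'robustbase' package is installed; the variable name `m` is only for illustration), a registered S3 method can be retrieved by generic and class even when it is not exported:

```r
library(robustbase)   # assumption: 'robustbase' is installed

# The method is registered for the generic predict(), so it can be
# looked up by generic name and class without being exported.
m <- getS3method("predict", "lmrob")
is.function(m)

# getAnywhere() also searches namespaces and reports where the object was found.
getAnywhere("predict.lmrob")

# List the S3 methods available for objects of class "lmrob".
methods(class = "lmrob")
```

In everyday use none of this is needed: calling predict() on an lmrob fit dispatches to the method automatically.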

Note that you also asked about this on the R-help mailing list on Feb 11 and got two answers. The second one, by Prof Brian Ripley, explained that matters *are* actually more complicated:

If you use robustness for a good reason, it seems a bit optimistic to assume that a future observation has normal errors (rather than a mixture of normal + "outlier"), and so the standard assumptions about prediction intervals would be doubtful.

But I agree (with you I assume) that sometimes you *want* to make this somewhat optimistic assumption.... and for that case, everything is ready for you on a silver plate :

Why did you not just read the help page for predict.lmrob?
Even though the object is hidden -- because you should call predict() -- it still has a nice help page {well, that can be improved, and I will do so for the next version of robustbase}, and that help
*does* answer your question on how to compute prediction intervals:

Andreas Ruckstuhl, the author of the function, does provide them (under the optimistic assumption), in exactly the same way as the
predict() method for lm {called "predict.lm"} does.

==> Just use  predict( <fitted lmrob object>,  interval = "prediction") (or a variant where you specify new data, weights, etc).
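
A short sketch of such a call, assuming 'robustbase' is installed and using the 'stackloss' data that ships with R (the object names `fit`, `new_obs` and `pred` are illustrative). The resulting intervals rest on the optimistic normal-error assumption discussed above:

```r
library(robustbase)   # assumption: 'robustbase' is installed

set.seed(42)          # lmrob() uses random subsampling for its initial estimate
fit <- lmrob(stack.loss ~ ., data = stackloss)

# Hypothetical new observations; the first three rows are reused for illustration.
new_obs <- stackloss[1:3, c("Air.Flow", "Water.Temp", "Acid.Conc.")]

# Dispatches to the (hidden) predict.lmrob method.
pred <- predict(fit, newdata = new_obs, interval = "prediction", level = 0.95)
pred
```

Supplying `newdata` makes it explicit that the intervals refer to future responses rather than to the observations used for fitting.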


Best regards,
Martin


    > -----Original Message-----
    > From: R-SIG-Robust [mailto:r-sig-robust-bounces using r-project.org] On Behalf Of Jonathan Burns
    > Sent: Wednesday, February 11, 2015 12:42 PM
    > To: r-sig-robust using r-project.org
    > Subject: [RsR] Prediction Intervals for Robust Regression

    > I have created robust regression models using least trimmed squares and MM- regression (using robustbase).

    > I am now looking to create prediction intervals for the predicted results.  While I have seen some discussion in the literature about confidence intervals on the estimates for robust regression, I haven't had much success on prediction intervals for the results.  I was wondering whether anyone would be able to provide some direction on how to create these prediction intervals in the robust regression setting.

    > Thanks,
    > Jonathan Burns
    > Sr. Statistician
    > General Dynamics Information Technology
    > Medicare & Medicaid Solutions
    > One West Pennsylvania Avenue
    > Baltimore, MD 21204
    > Jonathan.Burns1 using gdit.com
    > _______________________________________________
    > R-SIG-Robust using r-project.org mailing list
    > https://stat.ethz.ch/mailman/listinfo/r-sig-robust
