[R] Prediction intervals (i.e. not CI of the fit) for monotonic loess curve using bootstrapping

Roger Koenker rkoenker at illinois.edu
Wed Aug 13 17:09:16 CEST 2014


To follow up on David's suggestion on this thread,  I might add that the demo(predemo)
in my quantreg package illustrates a variety of approaches to prediction intervals for
quantile regression estimates.  Adapting this to monotone nonparametric estimation 
using rqss() or cobs would be quite straightforward, although the theory for such bands
is rather difficult and still under construction.
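
As a rough illustration of the rqss() route (this is not the code in
demo(predemo)), the sketch below fits increasing quantile curves at two
levels and uses them as a band for future observations. The simulated
data, lambda = 1 and the 0.05/0.95 levels are all assumptions made for the
example; nothing here stops the two curves from crossing, and no
theoretical coverage guarantee is claimed.

```r
library(quantreg)

set.seed(1)

# Simulated monotone data (hypothetical; stands in for real measurements)
n <- 200
d <- data.frame(x = sort(runif(n, 0, 10)))
d$y <- log1p(d$x) + rnorm(n, sd = 0.2)

# Increasing (constraint = "I") nonparametric quantile fits at two levels
fit_lo <- rqss(y ~ qss(x, lambda = 1, constraint = "I"), tau = 0.05, data = d)
fit_hi <- rqss(y ~ qss(x, lambda = 1, constraint = "I"), tau = 0.95, data = d)

# predict() for rqss fits does not extrapolate, so stay inside the data range
newd <- data.frame(x = seq(1, 9, by = 1))
band <- data.frame(
  x     = newd$x,
  lower = as.numeric(predict(fit_lo, newdata = newd)),
  upper = as.numeric(predict(fit_hi, newdata = newd))
)
print(band)
```

The pair (lower, upper) is read as a nominal 90% band for a new observation
at each x; a larger lambda gives smoother curves.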


url:    www.econ.uiuc.edu/~roger            Roger Koenker
email    rkoenker at uiuc.edu            Department of Economics
vox:     217-333-4558                University of Illinois
fax:       217-244-6678                Urbana, IL 61801

On Aug 13, 2014, at 9:59 AM, Jan Stanstrup <jan.stanstrup at fmach.it> wrote:

> Thanks to all of you for your suggestions and comments. I really 
> appreciate it.
> 
> Some comments to Dennis' comments:
> 1) I am not concerned about predicting outside the original range. That 
> would be nonsense anyway considering the physical phenomenon I am 
> modeling. I am, however, concerned that the bootstrapping leads to 
> extremely wide CIs at the extremes of the range when there are few data 
> points. But I guess there is not much I can do about that as long as I 
> rely on bootstrapping?
> 
> 2) I have made a function that does the interpolation to the requested 
> new x's from the original modeling data to get the residual variance and 
> the model variance. Then it interpolates the combined SDs back to the 
> new x values. See below.
> 
> 3) I understand that. For this project it is not that important that the 
> final prediction intervals are super accurate. But I need to hit the 
> ballpark. I am only trying to do something that doesn't grossly 
> underestimate the prediction error and doesn't make statisticians lose 
> their lunch at first glance.
> I also cannot avoid that my data contains erroneous values and I will 
> need to build many models unsupervised. But the fit should be good 
> enough that I plan to eliminate values outside some multiple of the 
> prediction interval and then re-calculate. And if the model is not good 
> in any range I will throw it out completely.
> 
> 
> Based on the formula of my last message I have made a function that at 
> least gives less optimistic intervals than what I could get with the 
> other methods I have tried. The function and example data can be found 
> here 
> https://github.com/stanstrup/retpred_shiny/blob/master/retdb_admin/make_predictions_CI_tests.R 
> in case anyone has any comments, suggestions or expletives about my 
> implementation.
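
The idea of combining model variance and residual variance can be sketched
in base R as follows. This is not the function from the linked repository:
isoreg() stands in for monoproc, the data are simulated, the residual SD is
pooled over the whole range (constant noise is assumed), the two variances
are assumed to add, and the limits use a normal approximation.

```r
set.seed(42)

# Simulated monotone data (hypothetical; stands in for real measurements)
n <- 150
d <- data.frame(x = sort(runif(n, 0, 10)))
d$y <- log1p(d$x) + rnorm(n, sd = 0.2)

# Loess fit forced to be non-decreasing. isoreg() is a base-R stand-in for
# monoproc; surface = "direct" lets predict() work beyond a resample's range.
fit_monotone <- function(dat, at) {
  fit <- loess(y ~ x, data = dat, span = 0.75,
               control = loess.control(surface = "direct"))
  isoreg(at, predict(fit, newdata = data.frame(x = at)))$yf
}

grid   <- seq(min(d$x), max(d$x), length.out = 50)
center <- fit_monotone(d, grid)

# Model uncertainty: spread of the curve over case-resampled refits
B <- 200
boot_fits <- replicate(B, {
  idx <- sample.int(n, replace = TRUE)
  fit_monotone(d[idx, ], grid)
})
model_sd <- apply(boot_fits, 1, sd)

# Residual uncertainty: one pooled SD around the monotone fit
resid_sd <- sd(d$y - fit_monotone(d, d$x))

# Combine both sources and form approximate 95% limits
z       <- qnorm(0.975)
pred_sd <- sqrt(model_sd^2 + resid_sd^2)
band <- data.frame(
  x        = grid,
  fit      = center,
  ci_lower = center - z * model_sd,
  ci_upper = center + z * model_sd,
  pi_lower = center - z * pred_sd,
  pi_upper = center + z * pred_sd
)
head(band)
```

The ci_* columns describe uncertainty in the curve only; the pi_* columns
add the residual spread and are therefore always at least as wide.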
> 
> 
> ----------------------
> Jan Stanstrup
> Postdoc
> 
> Metabolomics
> Food Quality and Nutrition
> Fondazione Edmund Mach
> 
> 
> 
> On 08/12/2014 05:40 PM, Bert Gunter wrote:
>> PI's of what? -- future individual values or mean values?
>> 
>> I assume quantreg provides quantiles for the latter, not the former.
>> (See ?predict.lm for a terse explanation of the difference). Both are
>> obtainable from bootstrapping but the details depend on what you are
>> prepared to assume. Consult references or your local statistician for
>> help if needed.
>> 
>> -- Bert
>> 
>> Bert Gunter
>> Genentech Nonclinical Biostatistics
>> (650) 467-7374
>> 
>> "Data is not information. Information is not knowledge. And knowledge
>> is certainly not wisdom."
>> Clifford Stoll
>> 
>> 
>> 
>> 
>> On Tue, Aug 12, 2014 at 8:20 AM, David Winsemius <dwinsemius at comcast.net> wrote:
>>> On Aug 12, 2014, at 12:23 AM, Jan Stanstrup wrote:
>>> 
>>>> Hi,
>>>> 
>>>> I am trying to find a way to estimate prediction intervals (PI) for a monotonic loess curve using bootstrapping.
>>>> 
>>>> At the moment my approach is to use the boot function from the boot package to bootstrap my loess model, which consists of loess + monoproc from the monoproc package (to force the fit to be monotonic, which gives me much improved results with my particular data). The output from the monoproc package is simply the fitted y values at each x-value.
>>>> I then use boot.ci (again from the boot package) to get confidence intervals. The problem is that this gives me confidence intervals (CI) for the "fit" (is there a proper way to specify this?) and not a prediction interval. The interval is thus way too optimistic to give me an idea of the confidence interval of a predicted value.
>>>> 
>>>> For linear models predict.lm can give PI instead of CI by setting interval = "prediction". Further discussion of that here:
>>>> http://stats.stackexchange.com/questions/82603/understanding-the-confidence-band-from-a-polynomial-regression
>>>> http://stats.stackexchange.com/questions/44860/how-to-prediction-intervals-for-linear-regression-via-bootstrapping
>>>> 
>>>> However I don't see a way to do that for boot.ci. Does there exist a way to get PIs after bootstrapping? If some sample code is required I am more than happy to supply it but I thought the question was general enough to be understandable without it.
>>>> 
>>> Why not use the quantreg package to estimate the quantiles of interest to you? That way you would not be depending on Normal theory assumptions which you apparently don't trust. I've used it with the `cobs` function from the package of the same name to implement the monotonic constraint. I think there is a worked example in the quantreg package, but since I bought Koenker's book, I may be remembering from there.
>>> --
>>> 
>>> David Winsemius
>>> Alameda, CA, USA
>>> 
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
> 


