[R] breakpoints and nonlinear regression

Tue Jan 17 19:34:17 CET 2012

On Tue, 17 Jan 2012, Bert Gunter wrote:

> On Tue, Jan 17, 2012 at 8:06 AM, Kenneth Frost <kfrost at wisc.edu> wrote:
>> Sorry, that wasn't too helpful... I see that the interval and se.fit arguments are currently ignored.
>
> Yes, because the fitted values are nonlinear in the parameters, which
> makes finding exact confidence regions impossible. I think the "usual"
> approach (subject to correction by experts) is to use a delta method
> approximation for the fitted variances from the varcov matrix of the
> parameters at the converged optimum (itself an approximation) and then
> a standard t-interval based on that. However, this approximation can
> be quite bad, because "degrees of freedom" don't mean much for
> nonlinear models -- in fact, that's the essential (and huge!)
> difference between linear and nonlinear models -- and the likelihood
> surface may not be close enough to quadratic. So one may do better
> with, e.g. a bootstrap approximation, although this can be
> problematic, too, due to convergence and other issues.
>
> What I think can be said with some certainty is that the idea of
> approximating by a segmented regression and then using CI's for each
> linear part in the "usual" way is a particularly bad one -- the CI's
> will be underestimated because they don't take into account the
> uncertainty in the location of the fitted breakpoints, which are
> nonlinear **and** non-smooth functions of the data.
>
> So if confidence intervals for the fitted values are really important,
> I suggest that Julian work with his local statistician to come up with
> the best approach for his particular situation. It's tricky.

I fully agree with Bert that, in this case, segmented regression does not 
seem to be a fruitful approach and that it's best to consult a local
statistician.
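
For reference, the delta-method approximation Bert describes can be 
sketched in a few lines of base R. This is only a minimal sketch on 
simulated data, not Julian's model: the data frame "d", the mean function 
a * (1 - exp(-b * x)), and the starting values are all made up for 
illustration, and Bert's caveats about the quality of the approximation 
apply in full.

```r
## Simulated data (hypothetical, for illustration only)
set.seed(1)
d <- data.frame(x = seq(1, 10, length.out = 40))
d$y <- 5 * (1 - exp(-0.4 * d$x)) + rnorm(40, sd = 0.2)

## Nonlinear least-squares fit
fit <- nls(y ~ a * (1 - exp(-b * x)), data = d,
           start = list(a = 4, b = 0.5))

## Delta method: Var(fitted_i) is approximated by g_i' V g_i, where g_i is
## the gradient of the mean function w.r.t. the parameters at x_i and V is
## the estimated covariance matrix of the parameter estimates
G  <- fit$m$gradient()            # n x p gradient matrix at the estimates
V  <- vcov(fit)                   # p x p covariance matrix
se <- sqrt(rowSums((G %*% V) * G))

## Approximate pointwise 95% t-intervals for the fitted curve
tq   <- qt(0.975, df.residual(fit))
yhat <- as.numeric(fitted(fit))
band <- cbind(fit = yhat, lwr = yhat - tq * se, upr = yhat + tq * se)
head(band)
```

These are pointwise intervals at the observed x values only; they say 
nothing about simultaneous coverage over the whole curve.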

However, I just wanted to clarify a theoretical detail about what 
breakpoints() does. The breakpoint estimates converge at the faster rate 
"n", while the regression coefficient estimates converge only at rate 
"sqrt(n)". This is why, in principle, it is possible to get "the usual" 
inference from segmented regressions: asymptotically, the uncertainty in 
the breakpoints is negligible relative to that in the coefficients. The 
price for this is to assume that the true model is in fact a segmented 
regression (with only breakpoints/coefficients unknown).

Hence, segmented regression will be "useful" (in the Tukey 
sense) if there are a few, relatively abrupt changes in a regression 
relationship. On the other hand, for approximating smooth changes there 
are typically better techniques available.
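
To illustrate the abrupt-change case, here is a minimal sketch with 
strucchange on simulated data. The data frame "d", the single break at 
x = 50, and the segment coefficients are assumptions chosen for the 
example. Note that the formula is a plain linear one (y ~ x) and that the 
observations are ordered by x, because breakpoints() partitions the data 
in observation order.

```r
library(strucchange)

## Simulated data with one abrupt change after x = 50 (hypothetical)
set.seed(42)
d <- data.frame(x = 1:100)
d$y <- ifelse(d$x <= 50, 1 + 0.5 * d$x, 45 - 0.2 * d$x) + rnorm(100)

## Estimate one breakpoint in the linear regression of y on x
bp <- breakpoints(y ~ x, h = 0.15, breaks = 1, data = d)

## Confidence interval for the breakpoint location (observation index)
confint(bp, breaks = 1)

## Segment-wise linear fits, treating the estimated breakpoint as known
d$seg <- breakfactor(bp, breaks = 1)
fm <- lm(y ~ 0 + seg + seg:x, data = d)
coef(fm)

## "Usual" pointwise confidence band within each segment
band <- predict(fm, interval = "confidence")

plot(y ~ x, data = d)
lines(d$x, band[, "fit"])
lines(d$x, band[, "lwr"], lty = 2)
lines(d$x, band[, "upr"], lty = 2)
```

The band from predict() is conditional on the estimated breakpoint, so it 
is only sensible under the assumption stated above, namely that the true 
model really is a segmented regression.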

Best,
Z

> Cheers,
> Bert
>
>>
>> On 01/17/12, crimsonengineer87  <julianjonreyes at gmail.com> wrote:
>>> Dear Forum,
>>>
>>> I have been racking my brain over this problem for the past few days. I have
>>> a dataset of (x,y). I have been able to obtain a nonlinear regression line
>>> using nls. However, we would like to do some statistical analysis. I would
>>> like to obtain a confidence interval for the curve. We thought we could
>>> divide up the curve into piecewise linear regressions and compute CIs from
>>> those portions. There is a package called strucchange that seems helpful,
>>> but I am thoroughly confused.
>>>
>>> 'breakpoints' is used to calculate the number of breaks in the data for
>>> linear regressions.  I have the following in my script:
>>>
>>> bp.pavlu <- breakpoints(Na ~ f(yield, a, b), h=0.15, breaks=3,
>>> data=pavludata)
>>> plot(bp.pavlu)
>>> breakpoints(bp.pavlu)
>>>
>>> But I am confused as to how to graph the piecewise functions that make up
>>> the curve. I am not even sure if I am using breakpoints correctly. Do I just
>>> give it a linear relationship (Na ~ yield), instead of what I have?
>>>
>>> Is there an easier way to calculate the confidence interval for a non-linear
>>> regression?
>>>
>>> I am new to R (as I've read in many questions), but I have most certainly
>>> tried many things and am just getting frustrated with the lack of examples
>>> for what I'd like to do with my data... I'd appreciate any insight. I can
>>> also provide more information if I am not clear. Thanks in advance.
>>>
>>> Julian
>>>
>>> --
>>> View this message in context: http://r.789695.n4.nabble.com/breakpoints-and-nonlinear-regression-tp4303629p4303629.html
>>> Sent from the R help mailing list archive at Nabble.com.
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>
> -- 
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
> Internal Contact Info:
> Phone: 467-7374
> Website:
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm