[R] piecewise linear regression

David Winsemius dwinsemius at comcast.net
Sat Mar 7 16:30:07 CET 2009


It actually looked reasonably economical but the output certainly is  
ugly. I see a variety of approaches in the r-help archives. This  
thread discusses two other approaches, degree-one splines from Berry  
and hard coded-coefficients from Lumley:

http://finzi.psych.upenn.edu/R/Rhelp08/archive/118046.html

The Lumley solution has the advantage which he articulates that the  
slopes are more directly interpretabel and in this case you can see  
that yourversion's year slope agrees with Lumley's suggested  
parametrization:

 > m=lm(percent~ year + pmax(year,1996) + pmin(year, 1996), weights=1/ 
se,
+ subset=year>=1988, da=d);
 > m

Call:
lm(formula = percent ~ year + pmax(year, 1996) + pmin(year, 1996),      
data = d, subset = year >= 1988, weights = 1/se)

Coefficients:
      (Intercept)              year  pmax(year, 1996)  pmin(year, 1996)
        1161.3126           -0.2177           -0.3494                NA

More compact output to boot.

-- 
David Winsemius

On Mar 7, 2009, at 9:54 AM, David Freedman wrote:

>
> Hi - I'd like to construct and plot the percents by year in a small  
> data set
> (d) that has values between 1988 and 2007.  I'd like to have a  
> breakpoint
> (buy no discontinuity) at 1996.  Is there a better way to do this  
> than in
> the code below?
>
>> d
>   year percent   se
> 1  1988    30.6 0.32
> 2  1989    31.5 0.31
> 3  1990    30.9 0.30
> 4  1991    30.6 0.28
> 5  1992    29.3 0.25
> 6  1994    30.3 0.26
> 7  1996    29.9 0.24
> 8  1998    28.4 0.22
> 9  2000    27.8 0.22
> 10 2001    26.1 0.20
> 11 2002    25.1 0.19
> 12 2003    24.4 0.19
> 13 2004    23.7 0.19
> 14 2005    25.1 0.18
> 15 2006    23.9 0.20
>
> dput(d)
> structure(list(year = c(1988L, 1989L, 1990L, 1991L, 1992L, 1994L,
> 1996L, 1998L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 2006L,
> 2007L), percent = c(30.6, 31.5, 30.9, 30.6, 29.3, 30.3, 29.9,
> 28.4, 27.8, 26.1, 25.1, 24.4, 23.7, 25.1, 23.9, 23.9), se = c(0.32,
> 0.31, 0.3, 0.28, 0.25, 0.26, 0.24, 0.22, 0.22, 0.2, 0.19, 0.19,
> 0.19, 0.18, 0.2, 0.18)), .Names = c("year", "percent", "se"), class =
> "data.frame", row.names = c(NA,
> -16L))
>
> with(d,plot(year,percent,pch=16,xlim=c(1988,2007)))
> m=lm(percent~ year + I(year-1996):I(year <= 1996), weights=1/se,
> subset=year>=1988, da=d);
> points(d$year,predict(m,dafr(year=d$year)),type='l',lwd=2,col='red')
>
> thanks very much
> David Freedman
>
>
> -- 
> View this message in context: http://www.nabble.com/piecewise-linear-regression-tp22388118p22388118.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT




More information about the R-help mailing list