# [R] Specify a correct formula in R for Piecewise Linear Functions?

Charles C. Berry cberry at tajo.ucsd.edu
Wed Jan 2 19:50:58 CET 2008

```
On Thu, 3 Jan 2008, zhijie zhang wrote:

>  Some developments with confusions. I tried the spline method and dummy
> variable approach to do it. But their results are very different. See
> following.
>

[volumes of output and gratuitous SAS code deleted]

> Q1: Why are these two methods so different for the results, e.g. the
> coefficients?

For the same reason that Thomas replied to my email suggesting a different
approach than the one I showed you, viz. the spline basis differs from the
basis vectors he constructed.

>
> Q2: The spline method is useful for piecewise linear functions, e.g.
> bs(distance_trans,degree=1,knots=c(13,25)),
> but what should I do if I want to fit a linear function where
> distance_trans<13 and a quadratic curve where distance_trans>=13?
> "bs(distance_trans,degree=c(1,2),knots=13)" does not work. The same
> question applies to three or more parts: <13, 13~25, >25.
>

Whew! My response would be "don't go there". Fit a richer basis than you
need and use penalization to damp out unneeded variation in the fit. Or
use GAMs.

But if you feel you must, you can construct things like

bs( pmax( 13, pmin( 25 , x ) ) )

> Q3: "fit <- glm( y ~ pmax(x,20)+pmin(x,20), family=binomial)" is good. But if
> I divide x into three or more parts, how should I specify it in this way?
>

As above.

> Hope someone can help. Thanks a lot.
>

You can help yourself a lot by taking a few minutes to learn to do in R
what you did in SAS. Reading the help pages AND running the examples is
often illuminating. For example,

example( pmin )

should give you some helpful hints.

HTH,

Chuck

>
>
> On Jan 2, 2008 11:58 PM, Thomas Lumley <tlumley at u.washington.edu> wrote:
>
>> On Tue, 1 Jan 2008, Charles C. Berry wrote:
>>> On Tue, 1 Jan 2008, zhijie zhang wrote:
>>>
>>>> Dear all,
>>>>  I have two variables, y and x. It seems that the relationship between
>> them
>>>> is Piecewise Linear Functions. The cutpoint is 20. That is, when x<20,
>> there
>>>> is a linear relationship between y and x; while x>=20, there is another
>>>> different linear relationship between them.
>>>> How can i specify their relationships in R correctly?
>>>> # glm(y~I(x<20)+I(x>=20),family = binomial, data = point)  something
>> like
>>>> this?
>>>
>>> Try this:
>>>
>>>> library(splines)
>>>> fit <- glm( y ~ bs( x, deg=1, knots=20 ), family=binomial)
>>>
>>
>> In the linear case I would actually argue that there is a benefit from
>> constructing the spline basis by hand, so that you know what the
>> coefficients mean. (For quadratic and higher order splines I agree that
>> pre-existing code for the B-spline basis makes a lot more sense).
>>
>> For example, in
>>   fit <- glm( y ~ pmax(x,20)+pmin(x,20), family=binomial)
>> the coefficients are the slope when x < 20 (the pmin term) and the slope
>> when x > 20 (the pmax term).
>>
>>        -thomas
>>
>>
>> Thomas Lumley                   Assoc. Professor, Biostatistics
>> tlumley at u.washington.edu        University of Washington, Seattle
>>
>
>
>
> --
> With Kind Regards,
>
> oooO:::::::::
> (..):::::::::
> :\.(:::Oooo::
> ::\_)::(..)::
> :::::::)./:::
> ::::::(_/::::
> :::::::::::::
> [***********************************************************************]
> Zhi Jie,Zhang ,PHD
> Tel:+86-21-54237149
> Dept. of Epidemiology,School of Public Health,Fudan University
> Postcode:200032
> Email:epistat at gmail.com
> Website: www.statABC.com
> [***********************************************************************]
>

Charles C. Berry                            (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu	            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

```
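
The two points made in the thread (that the `bs()` basis and a hand-built basis give different coefficients, and that the `pmax`/`pmin` construction extends to more than one cutpoint) can be sketched in R as follows. This is only an illustration under stated assumptions: the data are simulated, and the sample size, the seed, the slopes and the object names `fit_bs` and `fit_hand` are all invented for the example; only the knots 13 and 25 come from the thread. With an intercept in the model, both bases should span the same set of continuous piecewise linear functions, so the fitted values are expected to agree even though the coefficients do not.

```r
library(splines)

# Simulated data (an assumption for illustration, not the poster's data)
set.seed(1)
n <- 500
x <- runif(n, 0, 40)

# Linear predictor that changes slope at the knots 13 and 25
eta <- -2 + 0.15 * pmin(x, 13) -
  0.05 * pmax(13, pmin(x, 25)) +
  0.10 * pmax(x, 25)
y <- rbinom(n, 1, plogis(eta))

# Degree-1 B-spline basis with two interior knots
fit_bs <- glm(y ~ bs(x, degree = 1, knots = c(13, 25)), family = binomial)

# Hand-built basis: each term has slope 1 on one segment and is
# constant elsewhere, so each coefficient is that segment's slope
fit_hand <- glm(
  y ~ pmin(x, 13) + pmax(13, pmin(x, 25)) + pmax(x, 25),
  family = binomial
)

# The coefficients differ because the basis vectors differ ...
coef(fit_bs)
coef(fit_hand)

# ... but the fitted probabilities should coincide
all.equal(fitted(fit_bs), fitted(fit_hand), tolerance = 1e-6)
```

In the hand-built fit, the three non-intercept coefficients estimate the slopes (on the logit scale) for x < 13, 13 <= x < 25 and x >= 25. More cutpoints can be handled by adding one clamped term per segment in the same pattern.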