[R] Add a function in rq

Tue Sep 20 15:38:18 CEST 2011

On Sep 20, 2011, at 4:50 AM, mael wrote:

> Hi,
>
> I am trying to add a function in a linear quantile regresion to find a
> breakpoint. The function I want to add is:
>
> y=(k+ax)(x<B)+(k+(a-d)B+dx)(x>B)
>
> How do I write it in the rq() function? Do I need to define the  
> parameters
> in any way and how do I do that? I'm a biologist new to R.

You have not offered a test dataset, so no tested example from me but  
why not use an indicator term in a formula:

(You also have not defined your model very clearly. What are the roles  
of the different terms? There appear to be some superfluous constants.  
The (a-d) expression would not fit into regression formula conventions  
very well. (Either that or a linear model is not what you are asking  
about.)  This is just a wild guess:

You will get an intercept term by default which I take to be the "k"  
term in your expression. I doubt the final form will be exactly as you  
expected, but it may be able to recast it in you preferred form (once  
you make clear what those terms mean). The estimate for the first term  
should come out as an interaction with an x-term and an I(B<x) term.  
The estimate for the second term should likewise yield two  
coefficients. So (I thought) you should get 5 coefficients ...but   
When I did it on a simple linear regression model I got 6, so I guess  
my understanding of how many are linearly dependent was wrong.

 > lm(y ~ x*I(x < 1) + dx*I(x >1), data=dat)

Call:
lm(formula = y ~ x * I(x < 1) + dx * I(x > 1), data = dat)

Coefficients:
     (Intercept)                x     I(x < 1)TRUE               dx
           2.920            2.185          -25.026           49.833
    I(x > 1)TRUE   x:I(x < 1)TRUE  dx:I(x > 1)TRUE
              NA          -33.605          -50.870

If that "dx" term is from a differential equation, then scratch all of  
the above and instead use software appropriate to the task or perhaps  
use the diff() function to create a delta-x. Or if you are trying to  
do the very non-linear operation of estimating the B coefficient,  
which is not "linear quantile regression". "Linear" in the regression  
context means linear in the coefficients and the expression (x<B)  
would be non-linear when B is a coefficient to be estimated. In his  
book "Quantile regression Koenker says changepoints can be estimated  
with  the nonparametric methods offered in `rqss`

I did try the above formula in package quantreg, but I get an error  
from a singular design matrix:

require(quantreg)
rq( y ~ x*I(x < 1) + dx*I(x > 1), data=dat)
Error in rq.fit.br(x, y, tau = tau, ...) : Singular design matrix

So `rq` seems to be less capable of identifying and discarding  
dependencies in the design matrix than is `lm`.

I think some exploratory analysis might be your best first choice,  
using lowess. To see many types of regression methods illustrated you  
might want to look at Vincent Zoonekynd pages:
http://zoonek2.free.fr/UNIX/48_R/10.html

(it's too bad Vincent still has quantreg examples on his TODO list,  
but he does illustrate from package segmented and uses base `lowess`  
and a couple of other to show the exploratory methods in action.)

And finishing back at rq with e advice to use exploratory methods as  
in Koenker's example on p 16 of
http://cran.r-project.org/web/packages/quantreg/vignettes/rq.pdf

... where he used a 15-knot spline function inside an rq call.

--

David Winsemius, MD
West Hartford, CT