[R] natural splines

Eilers, P. (MStat) P.Eilers at lumc.nl
Fri May 9 12:02:43 CEST 2003


Indeed truncated power functions (TPF) are very useful to compute B-splines.
But why take chances with their bad numerical condition? Simple differences
will
turn the TPF basis into a B-spline basis. See the functions at the end. Try

 B = bbase(1:100, 0, 101, 10, 3)
 matplot(B, type = 'b')

to see it work.

Here I use equally spaced knots (as I always do, adding a penalty to tune
smoothness and avoid singularity problems). But with divided differences
this scheme will work for arbitrary knots (if no knots are the same).

Paul Eilers
Department of Medical Statistics
Leiden University Medical Centre


======= R functions ===================

tpower <- function(x, t, p) {
# Truncated p-th power function
    (x - t) ^ p * (x > t)
}

bbase <- function(x, xl, xr, ndx, deg){
# Construct B-spline basis from differences of truncated power functions
# Input
#   x:      x for which to compute basis
#   xl, xr: left and right boundary
#   nseg:   number of B-spline segments between xl and xr
#   deg:    degree (of the polynomial segments) of the B-spline
#
# Paul Eilers, 2003

    dx <- (xr - xl) / ndx
    knots <- seq(xl - deg * dx, xr + deg * dx, by = dx)
    P <- outer(x, knots, tpower, deg)
    n <- dim(P)[2]
    D <- diff(diag(n), differences = deg + 1) / (gamma(deg + 1) * dx ^ deg)
    B <- (-1) ^ (deg + 1) * P %*% t(D)
    B
}

==========================================


On Thu, 8 May 2003 15:07:56 +0100 (BST)
iwhite at staffmail.ed.ac.uk wrote:

> Apologies if this is this too obscure for R-help.
> 
> In package splines, ns(x,,knots,intercept=TRUE) produces an n by K+2
> matrix N, the values of K+2 basis functions for the natural splines with K
> (internal) knots, evaluated at x.  It does this by first generating an
> n by K+4 matrix B of unconstrained splines, then postmultiplying B by
> H, a K+4 by K+2 representation of the nullspace of C (2 by K+4), which
> contains the 2nd derivatives of the unconstrained splines evaluated at
> the boundary knots.  E.g. see Hastie and Tibshirani, Generalized Additive
> Models, exercise 2.5, p36.  The QR decomposition is used to get H.
> 
> This can produce basis functions which, while technically correct (they
> span the K+2 dim space of natural splines), can be counterintuitive.
> E.g. equally spaced knots symmetrically placed between the data extremes
> can produce very asymmetric arrangements, with N(K+2) not the mirror image
> of N(1), for example, and considerable loss of sparseness.
> 
> This approach works for any basis B, but for B-splines, the second
> derivatives are zero for all the unconstrained basis functions, apart
> from 3 at each end. All that is required is to combine these 3 so that
> the contributions to the 2nd derivatives cancel. In other words, we
> only need to find the null space of two 2 by 3 matrices, rather than a
> 2 by K+4. If the left-most internal knots are t(1) and t(2), and the
> left-hand boundary knot is t(0), we can replace B(1...3) with
> 
> B(1)+B(2)+B(3) and [t(2)-t(0)]*B(2) + [t(1)+t(2)-2*t(0)]*B(1),
> 
> (for example), and similarly at the right-hand end.
> 
> This seems simpler and more elegant than brute force QR on the full
> matrix of derivatives. But I may have missed some reason why it can't be
> used. Perhaps it doesn't work when intercept=FALSE?
> 
> ======================================
> I.White
> ICAPB, University of Edinburgh
> Ashworth Laboratories, West Mains Road
> Edinburgh EH9 3JT
> Fax: 0131 650 6564  Tel: 0131 650 5490
> E-mail: iwhite at staffmail.ed.ac.uk
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help

This is also a vote for using the truncated power basis which is extremely
simple and exceptionally fast for large datasets (see the rcspline.eval
function in the Hmisc package).  With modern matrix arithmetic (as in S),
the collinearity of the bases produced by these simple regression splines is
a moot point.
---
Frank E Harrell Jr              Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine  http://hesweb1.med.virginia.edu/biostat

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help




More information about the R-help mailing list