# [R] natural splines

iwhite@staffmail.ed.ac.uk iwhite at staffmail.ed.ac.uk
Thu May 8 16:07:56 CEST 2003

```Apologies if this is this too obscure for R-help.

In package splines, ns(x,,knots,intercept=TRUE) produces an n by K+2
matrix N, the values of K+2 basis functions for the natural splines with K
(internal) knots, evaluated at x.  It does this by first generating an
n by K+4 matrix B of unconstrained splines, then postmultiplying B by
H, a K+4 by K+2 representation of the nullspace of C (2 by K+4), which
contains the 2nd derivatives of the unconstrained splines evaluated at
the boundary knots.  E.g. see Hastie and Tibshirani, Generalized Additive
Models, exercise 2.5, p36.  The QR decomposition is used to get H.

This can produce basis functions which, while technically correct (they
span the K+2 dim space of natural splines), can be counterintuitive.
E.g. equally spaced knots symmetrically placed between the data extremes
can produce very asymmetric arrangements, with N(K+2) not the mirror image
of N(1), for example, and considerable loss of sparseness.

This approach works for any basis B, but for B-splines, the second
derivatives are zero for all the unconstrained basis functions, apart
from 3 at each end. All that is required is to combine these 3 so that
the contributions to the 2nd derivatives cancel. In other words, we
only need to find the null space of two 2 by 3 matrices, rather than a
2 by K+4. If the left-most internal knots are t(1) and t(2), and the
left-hand boundary knot is t(0), we can replace B(1...3) with

B(1)+B(2)+B(3) and [t(2)-t(0)]*B(2) + [t(1)+t(2)-2*t(0)]*B(1),

(for example), and similarly at the right-hand end.

This seems simpler and more elegant than brute force QR on the full
matrix of derivatives. But I may have missed some reason why it can't be
used. Perhaps it doesn't work when intercept=FALSE?

======================================
I.White
ICAPB, University of Edinburgh