[R] where are my pspline knots?

Spencer Graves spencer.graves at structuremonitoring.com
Thu Nov 18 12:50:11 CET 2010


       I hope that someone more knowledgeable will confirm or correct 
what I'm about to say.  I don't have time now to check this by studying 
the "pspline" code.

       From reading the pspline{survival} help page, I believe it uses 
standard B-splines, "penalising the integrated second derivative", with 
"theta" being the penalty or weight used to multiply the integral of the 
square of the second derivative before adding it to the AIC to be 
minimized.  The help page also suggests that the basis functions (i.e., 
the knots) are evenly spaced, though I'm not completely sure about any 
of this.

       A basic rule for B-splines is as follows:

             nbasis = order + n.knots

where n.knots = the number of interior knots, nbasis = the number of 
basis functions, and order = the order of the spline = one more than the 
degree of the polynomial segments.  With the default cubic splines, 
degree = 3, so order = 4.

       In a fit, nbasis = the number of coefficients estimated, I think.

       Consider the following modification of one of the examples in 

  coef(coxph(Surv(time, status) ~ ph.ecog + pspline(age,3), cancer))
   ph.ecog  ps(age)2  ps(age)3  ps(age)4  ps(age)5  ps(age)6  ps(age)7  
0.4480154 0.3221152 0.6470688 0.9342460 1.0289846 0.9639187 0.9004191 
  ps(age)9 ps(age)10 ps(age)11
1.2422085 1.6403940 2.0785682

       This fit includes one coefficient for ph.ecog and 10 for 
pspline(age,3).  I infer from this that there are 10 basis functions.  
Since degree = 3 (the default), order = 4, so the number of interior 
knots = 6 ( = 10-4).  To find where they are located, I would guess that 
they are evenly spaced in range(cancer$age) = (39, 82):

seq(39, 82, length=8)
[1] 39.00000 45.14286 51.28571 57.42857 63.57143 69.71429 75.85714 82.00000

       HOWEVER, these are only guesses;  I hope someone else will either 
confirm what I've said or correct any errors.

       It would help to have a function "as.fd.coxph.penal" in the "fda" 
package, as "fda" includes functions to help clarify this question:  
"knots.fd" extracts the knots from an object of class "fd", answering 
Federrico's question directly.  Unfortunately, I don't have time to 
write such.  It also includes a function "TaylorSpline" for converting a 
standard B-spline into the coefficients of a Taylor series approximation 
for each segment of the spline.  Chapter 3 in  Ramsay, Hooker, and 
Graves (2009) Functional Data Analysis with R and Matlab (Springer) 
includes code for plotting a B-spline basis set, and the "fda" package 
includes script files for working all but one of the 76 examples in that 
book.  You can find that script file without obtaining the book as follows:

fdaDir <- system.file(package='fda')
fdaDirs <- dir(fdaDir, full=TRUE)
fdaScripts <- grep('scripts$', fdaDirs, value=TRUE)
fdaCh <- dir(fdaScripts, full=TRUE)
fdaCh3 <- grep('fdarm-ch03.R$', fdaCh, value=TRUE)

       You can then make a local copy of "fdaCh3" and work through it 
line by line to see plots of the B-spline basis.  This would help 
Federico more if "fda" included a "as.fd.coxph.penal" function, because 
then he could plot the spline basis used in his example.

       Hope this helps.

On 11/17/2010 11:39 AM, David Winsemius wrote:
> On Nov 17, 2010, at 1:05 PM, Federico Calboli wrote:
>> Hi All,
>> I am trying to figure out how to get the position of the knots in a 
>> pspline used in a cox model.
> As far as I understand it, the term "knot" should be used with natural 
> splines, but not for penalized smoothing splines. See p 124 of 
> Therneau and Gramsch for better description and illustrations. If you 
> want to see the weighting of a particular pspline then this example 
> may help:
> nn=rnorm(20)
> matrix(pspline(nn), nrow=20)[order(nn), ]
> # sorts the spline basis weights so the overlap is "visible"
>> my.model = coxph(Surv(agein, ageout, status) ~ pspline(x), mydata) # 
>> x being continuous
>> How do I find out where the knot of the spline are? I would like to 
>> know to figure out how many cases are there between each knot.
>> Best,
>> Federico
>> -- 
>> Federico C. F. Calboli
>> Department of Epidemiology and Biostatistics
>> Imperial College, St. Mary's Campus
>> Norfolk Place, London W2 1PG
>> Tel +44 (0)20 75941602   Fax +44 (0)20 75943193
>> f.calboli [.a.t] imperial.ac.uk
>> f.calboli [.a.t] gmail.com
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide 
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

More information about the R-help mailing list