[R] where are my pspline knots?
Spencer Graves
spencer.graves at structuremonitoring.com
Thu Nov 18 12:50:11 CET 2010
Hello:
I hope that someone more knowledgeable will confirm or correct
what I'm about to say. I don't have time now to check this by studying
the "pspline" code.
From reading the pspline{survival} help page, I believe it uses
standard B-splines, "penalising the integrated second derivative", with
"theta" being the penalty or weight used to multiply the integral of the
square of the second derivative before adding it to the AIC to be
minimized. The help page also suggests that the basis functions (i.e.,
the knots) are evenly spaced, though I'm not completely sure about any
of this.
A basic rule for B-splines is as follows:
nbasis = order + n.knots
where n.knots = the number of interior knots, nbasis = the number of
basis functions, and order = the order of the spline = one more than the
degree of the polynomial segments. With the default cubic splines,
degree = 3, so order = 4.
In a fit, nbasis = the number of coefficients estimated, I think.
Consider the following modification of one of the examples in
help('pspline'):
coef(coxph(Surv(time, status) ~ ph.ecog + pspline(age,3), cancer))
ph.ecog ps(age)2 ps(age)3 ps(age)4 ps(age)5 ps(age)6 ps(age)7
ps(age)8
0.4480154 0.3221152 0.6470688 0.9342460 1.0289846 0.9639187 0.9004191
1.0119748
ps(age)9 ps(age)10 ps(age)11
1.2422085 1.6403940 2.0785682
This fit includes one coefficient for ph.ecog and 10 for
pspline(age,3). I infer from this that there are 10 basis functions.
Since degree = 3 (the default), order = 4, so the number of interior
knots = 6 ( = 10-4). To find where they are located, I would guess that
they are evenly spaced in range(cancer$age) = (39, 82):
seq(39, 82, length=8)
[1] 39.00000 45.14286 51.28571 57.42857 63.57143 69.71429 75.85714 82.00000
HOWEVER, these are only guesses; I hope someone else will either
confirm what I've said or correct any errors.
It would help to have a function "as.fd.coxph.penal" in the "fda"
package, as "fda" includes functions to help clarify this question:
"knots.fd" extracts the knots from an object of class "fd", answering
Federrico's question directly. Unfortunately, I don't have time to
write such. It also includes a function "TaylorSpline" for converting a
standard B-spline into the coefficients of a Taylor series approximation
for each segment of the spline. Chapter 3 in Ramsay, Hooker, and
Graves (2009) Functional Data Analysis with R and Matlab (Springer)
includes code for plotting a B-spline basis set, and the "fda" package
includes script files for working all but one of the 76 examples in that
book. You can find that script file without obtaining the book as follows:
fdaDir <- system.file(package='fda')
fdaDirs <- dir(fdaDir, full=TRUE)
fdaScripts <- grep('scripts$', fdaDirs, value=TRUE)
fdaCh <- dir(fdaScripts, full=TRUE)
fdaCh3 <- grep('fdarm-ch03.R$', fdaCh, value=TRUE)
You can then make a local copy of "fdaCh3" and work through it
line by line to see plots of the B-spline basis. This would help
Federico more if "fda" included a "as.fd.coxph.penal" function, because
then he could plot the spline basis used in his example.
Hope this helps.
Spencer
On 11/17/2010 11:39 AM, David Winsemius wrote:
>
> On Nov 17, 2010, at 1:05 PM, Federico Calboli wrote:
>
>> Hi All,
>>
>> I am trying to figure out how to get the position of the knots in a
>> pspline used in a cox model.
>
> As far as I understand it, the term "knot" should be used with natural
> splines, but not for penalized smoothing splines. See p 124 of
> Therneau and Gramsch for better description and illustrations. If you
> want to see the weighting of a particular pspline then this example
> may help:
>
> nn=rnorm(20)
> matrix(pspline(nn), nrow=20)[order(nn), ]
> # sorts the spline basis weights so the overlap is "visible"
>>
>> my.model = coxph(Surv(agein, ageout, status) ~ pspline(x), mydata) #
>> x being continuous
>>
>> How do I find out where the knot of the spline are? I would like to
>> know to figure out how many cases are there between each knot.
>>
>> Best,
>>
>> Federico
>>
>> --
>> Federico C. F. Calboli
>> Department of Epidemiology and Biostatistics
>> Imperial College, St. Mary's Campus
>> Norfolk Place, London W2 1PG
>>
>> Tel +44 (0)20 75941602 Fax +44 (0)20 75943193
>>
>> f.calboli [.a.t] imperial.ac.uk
>> f.calboli [.a.t] gmail.com
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
More information about the R-help
mailing list