[R] the meaning of the B-spline coefficients

Berton Gunter gunter.berton at gene.com
Thu Feb 2 18:43:53 CET 2006


Check out:

**The Elements of Statistical Learning** by Hastie, Tibshirani, and
Friedman.
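
As a rough illustration of how the coefficients can be read (a sketch only, using simulated data in place of the posted f0 values, so the numbers will not match the output below): each coefficient is a weight on one B-spline basis column, not a slope or intercept for a knot span. The variable names X and B and the simulated curve are assumptions made for this example.

```r
library(splines)

# Simulated stand-in for the posted data (hypothetical values, not the real f0).
set.seed(1)
test <- data.frame(time = 1:100)
test$f0 <- 100 + 20 * sin(test$time / 8) + rnorm(100, sd = 5)

fit <- lm(f0 ~ bs(time, df = 13), data = test)

# Design matrix: a column of ones plus the 13 B-spline basis columns.
X <- model.matrix(fit)

# Fitted values are the basis columns weighted by the coefficients.
max(abs(X %*% coef(fit) - fitted(fit)))  # numerically zero

# With the default intercept = FALSE, every retained basis function is 0 at
# the left boundary, so the intercept equals the fitted value at time = 1.
B <- bs(test$time, df = 13)
B[1, ]
c(coef(fit)[1], fitted(fit)[1])

# At the right boundary only the last basis function is nonzero (it equals 1),
# so the fitted value at time = 100 is intercept + last coefficient.
B[100, ]
c(coef(fit)[1] + coef(fit)[14], fitted(fit)[100])

# df = 13 with cubic pieces and no intercept column gives 13 - 3 = 10
# interior knots, placed at quantiles of time.
attr(B, "knots")
```

Read this way, and assuming the default boundary knots at time 1 and 100, the intercept of 97.307 in the output below would be the fitted f0 at time 1, and 97.307 + 31.919 = 129.226 the fitted value at time 100. The coefficients in between weight overlapping local bumps, each nonzero over at most four adjacent knot intervals, so they are hard to interpret one at a time; plotting the fitted curve is usually more informative.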

-- Bert Gunter
 Genentech

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of liufang at uchicago.edu
Sent: Thursday, February 02, 2006 8:48 AM
To: r-help at stat.math.ethz.ch
Subject: [R] the meaning of the B-spline coefficients

Dear all,

I'm trying to figure out the exact meaning of the B-spline
coefficients generated by the R command bs(). After reading a
lot of things, I still have no clue...

Here's my data.

> test
    time        f0
1      1  94.76328
2      2 102.47954
3      3 105.01234
4      4 107.21387
5      5 108.63279
6      6 109.54507
7      7 113.87931
8      8 118.21356
9      9 121.08652
10    10 121.78338
11    11 118.84742
12    12 112.15230
13    13  99.64756
14    14  85.87430
15    15  80.15959
16    16  78.16951
17    17  76.85120
18    18  76.64255
19    19  75.23007
20    20  74.18679
21    21  97.82914
22    22  97.99156
23    23  98.24108
24    24  99.96225
25    25 100.91948
26    26 101.75905
27    27 114.88339
28    28 125.78792
29    29 130.62168
30    30 132.42147
31    31 120.75498
32    32 116.46438
33    33  95.83809
34    34  83.55815
35    35  83.49363
36    36  83.42912
37    37  83.43273
38    38  83.49382
39    39  83.55078
40    40  83.55078
41    41  89.22781
42    42  93.01460
43    43  94.13982
44    44  95.12909
45    45  97.24925
46    46 100.00507
47    47 108.08150
48    48 115.54357
49    49 126.74814
50    50 127.63650
51    51 123.09723
52    52 115.97800
53    53 107.58863
54    54  99.78626
55    55  90.47310
56    56  81.92469
57    57  79.50943
58    58  75.78710
59    59  73.05736
60    60  72.26699
61    61  93.12932
62    62  91.30452
63    63  91.02817
64    64  91.16687
65    65  93.74704
66    66  96.39891
67    67  99.64934
68    68 104.37769
69    69 110.45508
70    70 111.70428
71    71  93.69037
72    72  85.67118
73    73  85.06033
74    74  84.44947
75    75  83.83862
76    76  82.93448
77    77  80.80928
78    78  78.70249
79    79  78.70249
80    80  78.70249
81    81 140.00112
82    82 139.98659
83    83 142.49656
84    84 145.00654
85    85 147.25728
86    86 149.06518
87    87 151.23441
88    88 156.06892
89    89 160.21311
90    90 162.04904
91    91 124.28610
92    92  86.27715
93    93  69.96150
94    94  70.23389
95    95  74.23542
96    96  78.23695
97    97  82.23848
98    98  86.24001
99    99  92.06214
100  100 114.89530

Here's my R output.
> bsp = lm(f0~bs(time,df=13),data=test)
> summary(bsp)

Call:
lm(formula = f0 ~ bs(time, df = 13), data = test)

Residuals:
     Min       1Q   Median       3Q      Max 
-31.6519  -7.1230   0.1433   6.1755  25.2094 

Coefficients:
                    Estimate Std. Error t value Pr(>|t|)    
(Intercept)           97.307     10.343   9.408 7.26e-15 ***
bs(time, df = 13)1     5.029     20.150   0.250 0.803490    
bs(time, df = 13)2    50.806     14.396   3.529 0.000672 ***
bs(time, df = 13)3   -66.700     15.579  -4.281 4.81e-05 ***
bs(time, df = 13)4    73.981     13.516   5.474 4.29e-07 ***
bs(time, df = 13)5   -59.803     14.225  -4.204 6.40e-05 ***
bs(time, df = 13)6    46.817     13.740   3.407 0.001000 ***
bs(time, df = 13)7   -23.807     13.982  -1.703 0.092235 .  
bs(time, df = 13)8     8.090     13.889   0.582 0.561776    
bs(time, df = 13)9   -22.132     14.100  -1.570 0.120170    
bs(time, df = 13)10   16.759     14.566   1.151 0.253083    
bs(time, df = 13)11   98.630     16.566   5.954 5.55e-08 ***
bs(time, df = 13)12 -102.236     16.784  -6.091 3.06e-08 ***
bs(time, df = 13)13   31.919     14.629   2.182 0.031839 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Residual standard error: 12.46 on 86 degrees of freedom
Multiple R-Squared: 0.7316,	Adjusted R-squared: 0.691 
F-statistic: 18.03 on 13 and 86 DF,  p-value: < 2.2e-16 
 

Do the above coefficients describe features of the response
variable "f0"? For example, does the intercept approximate the
initial value of "f0" at "time" 1? I specified df=13, so there
are 10 interior knots in this case. Do bs(time, df = 13)4 through
bs(time, df = 13)13 indicate the slope or intercept of the
original curve (of "f0") within the spans between the 10 knots?
What is the meaning of bs(time, df = 13)1 through 3, then?

I'm reading "B(asic)-Spline Basics" by Carl de Boor, but I
really don't understand the spline formulas. I'd really
appreciate it if someone could help me with this.

Many thanks!

Fang Liu
liufang at uchicago.edu
Ph.D. student
Department of linguistics
University of Chicago



