[R] NaN causes "error in fitter" with cph.calibrate from pkg Design

Mon Nov 3 16:38:54 CET 2008

I have been  attempting  to use cph models to get better calibration  
of my models for which I had originally used logistic regression. I  
tried running with 40 repetitions and got an error. I then tried 500  
repetitions (thinking that the NaNs in the output below might be  
caused by that choice) and then let my computer crunch for several  
hours and got only the same error message and warnings for my output.

Since the dataset is large and has caused memory overflow in the past  
(despite having 4 GB in a UNIX OS), I had previously saved it and  
loaded it back into a clean run of R. The memory overflow did not  
occur with this approach, but it appears that the NaNs at the "high"  
end of the survival probability scale are causing fatal errors with  
the fitter. It appears that for half of the samples, an n of 40 is  
being used while for the other other half an adaptive sampling size is  
being used.

I hope I have included enough detail below. Any thoughts on a way out  
of this? Would the full screen output be useful in construction a  
calibration curve? (I have copied it to a file.) Would there be a way  
of adaptively increasing the number of subjects in the sampling at the  
higher values of x (which I take to be a survival fraction at the  
interval being examined)? Or perhaps there would be a way of  
specifying a larger sampling fraction for all runs? I have gone  
through the documentation and have not identified a parameter that  
would do so.

I tried looking at the code of cph.calibrate, but could not figure out  
how the sample size of 40 was established.

Sincerely;

David Winsemius
---the call------and trimmed output --- and after that, system info  
and str(<model>)

cal <- calibrate( cph.A.S.H.T.G.S.D, m=40, method="boot", B=50, u=4)

Using Cox survival estimates at 1 Days

Averaging  5  repetitions of B= 10

           x      n events     KM std.err
[1,] 0.8766     41     18 0.5261 0.26987
[2,] 0.9359     39     10 0.7453 0.34283
[3,] 0.9416     40      7 0.8247 0.42187
[4,] 0.9466     40      6 0.8610 0.45340
[5,] 0.9499     40      7 0.9041 0.59887
[6,] 0.9531     40      7 0.8337 0.50418
[7,] 0.9552     40      7 0.8536 0.46862

trimmed output  in middle

[54,] 0.9850   1000     84 0.9227 0.12758
[55,] 0.9855     40      1 0.9688 1.00004
[56,] 0.9860   1240     91 0.9349 0.12622
[57,] 0.9865     40      2 0.9500 0.70718
[58,] 0.9870   1520    103 0.9382 0.12124
[59,] 0.9875     40      3 0.9677 1.00004
[60,] 0.9880   1960    129 0.9402 0.10282
[61,] 0.9885     40      1 0.9565 1.00008
[62,] 0.9890   2520    143 0.9530 0.10141
[63,] 0.9895     40      3 0.9750 1.00003
[64,] 0.9900   3320    179 0.9544 0.09349
[65,] 0.9905     40      1 1.0000     NaN
[66,] 0.9910   4200    212 0.9571 0.08377
[67,] 0.9915     40      1 1.0000     NaN
[68,] 0.9920   5921    264 0.9617 0.07440
[69,] 0.9925     40      0 1.0000     NaN
[70,] 0.9930   7960    330 0.9650 0.06637
[71,] 0.9935     40      1 1.0000     NaN
[72,] 0.9940  12160    436 0.9695 0.05722
[73,] 0.9945     40      0 1.0000     NaN
[74,] 0.9950  18281    539 0.9777 0.05360
[75,] 0.9955     40      1 0.9750 1.00003
[76,] 0.9960  30841    666 0.9831 0.04845
[77,] 0.9965     40      1 1.0000     NaN
[78,] 0.9971  55002    951 0.9863 0.03981
[79,] 0.9975     40      0 1.0000     NaN
[80,] 0.9981 114005   1317 0.9909 0.03398
[81,] 0.9985     40      0 1.0000     NaN
[82,] 0.9991 366094   1790 0.9961 0.02874
[83,] 0.9995     40      1 0.9750 1.00003
[84,] 0.9997 253649    518 0.9984 0.05316
[85,] 0.9999     40      0 1.0000     NaN
Error in fitter(X, Y, strata = Strata, offset = offset, weights =  
weights,  :
  NA/NaN/Inf in foreign function call (arg 6)
In addition: There were 50 or more warnings (use warnings() to see the  
first 50)
er(X, Y, strata = Strata, offset = offset, weights = weights,  :

The warnings are of two types. Here are the first ten:

 > warnings()
Warning messages:
1: In groupkm(cox, Surv(y[, 1], y[, 2]), u = u, cuts = orig.cuts) ... :
  one interval had < 2 observations
2: In groupkm(cox, Surv(y[, 1], y[, 2]), u = u, cuts = orig.cuts) ... :
  one interval had < 2 observations
3: In groupkm(cox, Surv(y[, 1], y[, 2]), u = u, cuts = orig.cuts) ... :
  one interval had < 2 observations
4: In train.stat + train.statj ... :
  longer object length is not a multiple of shorter object length
5: In test.stat + test.statj * wt ... :
  longer object length is not a multiple of shorter object length
6: In groupkm(cox, Surv(y[, 1], y[, 2]), u = u, cuts = orig.cuts) ... :
  one interval had < 2 observations
7: In groupkm(cox, Surv(y[, 1], y[, 2]), u = u, cuts = orig.cuts) ... :
  one interval had < 2 observations
8: In groupkm(cox, Surv(y[, 1], y[, 2]), u = u, cuts = orig.cuts) ... :
  one interval had < 2 observations
9: In groupkm(cox, Surv(y[, 1], y[, 2]), u = u, cuts = orig.cuts) ... :
  one interval had < 2 observations
10: In num + (!na) ... :
  longer object length is not a multiple of shorter object length

 > .Platform
$OS.type
[1] "unix"

$file.sep
[1] "/"

$dynlib.ext
[1] ".so"

$GUI
[1] "AQUA"

$endian
[1] "little"

$pkgType
[1] "mac.binary"

$path.sep
[1] ":"

$r_arch
[1] "i386"

 > str(cph.A.S.H.T.G.S.D)
List of 30
$ coefficients     : Named num [1:16]  0.0588  0.0296  0.1848 -0.0436   
0.0838 ...
  ..- attr(*, "names")= chr [1:16] "Age" "Age'" "Sex=Male"  
"BL_HDL.A" ...
$ var              : num [1:16, 1:16]  6.21e-06 -5.83e-06 -1.43e-06   
1.51e-06 -1.03e-06 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:16] "Age" "Age'" "Sex=Male" "BL_HDL.A" ...
  .. ..$ : chr [1:16] "Age" "Age'" "Sex=Male" "BL_HDL.A" ...
$ loglik           : num [1:2] -110975 -105680
$ score            : num 17894
$ iter             : int 9
$ linear.predictors: Named num [1:886393] -0.333 -0.710  1.028 -1.087   
1.127 ...
  ..- attr(*, "names")= chr [1:886393] "1" "2" "3" "4" ...
$ residuals        : Named num [1:886393] -0.01252 -0.00859 -0.04884  
-0.00589 -0.05392 ...
  ..- attr(*, "names")= chr [1:886393] "1" "2" "3" "4" ...
$ means            : num [1:16] 43.472  4.162  0.563 52.815  8.503 ...
$ terms            :Classes 'terms', 'formula' length 3 srv900 ~  
rcs(Age, c(35, 50, 65)) + Sex + rcs(BL_HDL.A, 3) *  
rcs(BL_CHOLEST.A,      3) + rcs(BL_GGT.A, 3) + BL_TRIGLYC.A +  
rcs(BL_GGT.A, 3) +  ...
  .. ..- attr(*, "variables")= language list(srv900, rcs(Age, c(35,  
50, 65)), Sex, rcs(BL_HDL.A, 3),      rcs(BL_CHOLEST.A, 3),  
rcs(BL_GGT.A, 3), BL_TRIGLYC.A, BP.B,  ...
  .. ..- attr(*, "factors")= int [1:9, 1:9] 0 1 0 0 0 0 0 0 0 0 ...
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:9] "srv900" "rcs(Age, c(35, 50, 65))" "Sex"  
"rcs(BL_HDL.A, 3)" ...
  .. .. .. ..$ : chr [1:9] "rcs(Age, c(35, 50, 65))" "Sex"  
"rcs(BL_HDL.A, 3)" "rcs(BL_CHOLEST.A, 3)" ...
  .. ..- attr(*, "term.labels")= chr [1:9] "rcs(Age, c(35, 50, 65))"  
"Sex" "rcs(BL_HDL.A, 3)" "rcs(BL_CHOLEST.A, 3)" ...
  .. ..- attr(*, "specials")=Dotted pair list of 2
  .. .. ..$ strat  : NULL
  .. .. ..$ cluster: NULL
  .. ..- attr(*, "order")= int [1:9] 1 1 1 1 1 1 1 1 2
  .. ..- attr(*, "intercept")= int 1
  .. ..- attr(*, "response")= int 1
  .. ..- attr(*, ".Environment")=<R_GlobalEnv>
$ n                : 'table' int [, 1:2] 877967 8426
  ..- attr(*, "dimnames")=List of 1
  .. ..$ Status: chr [1:2] "No Event" "Event"
$ call             : language cph(formula = srv900 ~ rcs(Age, c(35,  
50, 65)) + Sex + rcs(BL_HDL.A,      3) * rcs(BL_CHOLEST.A, 3) +  
rcs(BL_GGT.A, 3) + BL_TRIGLYC.A +  ...
$ Design           :List of 11
  ..$ name        : chr [1:9] "Age" "Sex" "BL_HDL.A" "BL_CHOLEST.A" ...
  ..$ label       : chr [1:9] "Age" "Sex" "BL_HDL.A" "BL_CHOLEST.A" ...
  ..$ units       : Named chr [1:8] "" "" "" "" ...
  .. ..- attr(*, "names")= chr [1:8] "Age" "Sex" "BL_HDL.A"  
"BL_CHOLEST.A" ...
  ..$ colnames    : chr [1:16] "Age" "Age'" "Sex=Male" "BL_HDL.A" ...
  ..$ assume      : chr [1:9] "rcspline" "category" "rcspline"  
"rcspline" ...
  ..$ assume.code : int [1:9] 4 5 4 4 4 1 1 1 9
  ..$ parms       :List of 6
  .. ..$ Age                    : num [1:3] 35 50 65
  .. ..$ Sex                    : chr [1:2] "Female" "Male"
  .. ..$ BL_HDL.A               : num [1:3] 35 51 73
  .. ..$ BL_CHOLEST.A           : num [1:3] 154 199 253
  .. ..$ BL_GGT.A               : num [1:3] 11 22 55
  .. ..$ BL_HDL.A * BL_CHOLEST.A: num [1:3, 1:5] 3 4 0 0 0 0 1 0 0 0 ...
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : NULL
  .. .. .. ..$ : chr [1:5] "parms" "" "" "" ...
  ..$ limits      : list()
  ..$ values      : list()
  ..$ nonlinear   :List of 9
  .. ..$ Age                    : logi [1:2] FALSE  TRUE
  .. ..$ Sex                    : logi FALSE
  .. ..$ BL_HDL.A               : logi [1:2] FALSE  TRUE
  .. ..$ BL_CHOLEST.A           : logi [1:2] FALSE  TRUE
  .. ..$ BL_GGT.A               : logi [1:2] FALSE  TRUE
  .. ..$ BL_TRIGLYC.A           : logi FALSE
  .. ..$ BP.B                   : logi FALSE
  .. ..$ BP.A                   : logi FALSE
  .. ..$ BL_HDL.A * BL_CHOLEST.A: logi [1:4] FALSE  TRUE  TRUE  TRUE
  ..$ interactions: num [1:3, 1] 3 4 0
$ assign           :List of 9
  ..$ rcs(Age, c(35, 50, 65))              : int [1:2] 1 2
  ..$ Sex                                  : int 3
  ..$ rcs(BL_HDL.A, 3)                     : int [1:2] 4 5
  ..$ rcs(BL_CHOLEST.A, 3)                 : int [1:2] 6 7
  ..$ rcs(BL_GGT.A, 3)                     : int [1:2] 8 9
  ..$ BL_TRIGLYC.A                         : int 10
  ..$ BP.B                                 : int 11
  ..$ BP.A                                 : int 12
  ..$ rcs(BL_HDL.A, 3):rcs(BL_CHOLEST.A, 3): int [1:4] 13 14 15 16
$ na.action        :List of 3
  ..$ nmiss             : Named int [1:9] 780 9389 3891 2405 2404 3055  
2677 10623 10328
  .. ..- attr(*, "names")= chr [1:9] "srv900" "Age" "Sex" "BL_HDL.A" ...
  ..$ omit              : Named int [1:24266] 20 29 53 75 82 84 119  
160 201 230 ...
  .. ..- attr(*, "names")= chr [1:24266] "20" "29" "53" "75" ...
  ..$ na.detail.response: NULL
  ..- attr(*, "class")= chr "delete"
$ fail             : logi FALSE
$ non.slopes       : num 0
$ stats            : Named num [1:8] 886393   8426  10590     16       
0 ...
  ..- attr(*, "names")= chr [1:8] "Obs" "Events" "Model L.R." "d.f." ...
$ method           : chr "efron"
$ maxtime          : num 9.95
$ time.inc         : num 1
$ units            : chr "Day"
$ fitFunction      : chr [1:2] "cph" "coxph"
$ center           : num 0.82
$ scale.pred       : chr [1:2] "log Relative Hazard" "Hazard Ratio"
$ x                : num [1:886393, 1:16] 35.1 27.0 56.4 22.5 56.0 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:886393] "1" "2" "3" "4" ...
  .. ..$ : chr [1:16] "Age" "Age'" "Sex=Male" "BL_HDL.A" ...
$ y                : Surv [1:886393, 1:2] 9.94661+ 9.90007+ 9.84805+  
9.85900+ 9.86448+ 9.85900+ 9.88364+ 9.85079+ 9.79329+ 9.82615+ ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:886393] "1" "2" "3" "4" ...
  .. ..$ : chr [1:2] "time" "status"
  ..- attr(*, "type")= chr "right"
  ..- attr(*, "units")= chr "Day"
  ..- attr(*, "time.label")= chr "surv.yr"
  ..- attr(*, "event.label")= chr "cens"
$ time             : num [1:2599] 0.00000 0.00000 0.00274 0.00548  
0.00821 ...
$ surv             : atomic [1:2599] 1 1 1 1 1 ...
  ..- attr(*, "type")= chr "efron"
$ std.err          : num [1:2599]     NA 0.0483 0.0482 0.0482 0.0479 ...
$ surv.summary     : num [1:10, 1, 1:3] 1.000 0.999 0.998 0.997  
0.996 ...
  ..- attr(*, "dimnames")=List of 3
  .. ..$ : chr [1:10] "0" "1" "2" "3" ...
  .. ..$ : chr "Stratum 1"
  .. ..$ : chr [1:3] "Survival" "n.risk" "std.err"
- attr(*, "class")= chr [1:3] "cph" "Design" "coxph"