[Rd] Difficult debug
Therneau, Terry M., Ph.D.
therne@u @end|ng |rom m@yo@edu
Wed Feb 7 21:01:44 CET 2024
I've hit a roadblock debugging a new update to the survival package. I do debugging in
a developement envinment, i.e. I don't create and load a package but rather source all
the .R files and dyn.load an .so file, which makes things a bit easier.
Running with R -d "valgrind --tool=memcheck --leak-check=full" one of my test files
crashes in simple R code a dozen lines after the likely culprit has been called, i.e, the
survfit function for an Aalen-Johansen, containing a .Call to the new C code. The
valgrind approach had already allowed me to find a few other (mostly dumb) errors that led
to an out of bounds access, e.g., the wrong endpoint variable in a for( ) loop. What
would others advise as a next step?
Here is the last part of the screen
> fit2 <- coxph(list(Surv(tstart, tstop, bstat) ~ 1,
+ c(1:4):5 ~ age / common + shared), id= id, istate=bili4,
+ data=pbc2, ties='breslow', x=TRUE)
> surv2 <- survfit(fit2, newdata=list(age=50), p0=c(.4, .3, .2, .1, 0))
> test2 <- mysurv(fit2, pbc2$bili4, p0= 4:0/10, fit2, x0 =50)
==31730== Invalid read of size 8
==31730== at 0x298A07: Rf_allocVector3 (memory.c:2861)
==31730== by 0x299B2C: Rf_allocVector (Rinlinedfuns.h:595)
==31730== by 0x299B2C: R_alloc (memory.c:2330)
==31730== by 0x3243C6: do_which (summary.c:1152)
==31730== by 0x23D8EF: bcEval (eval.c:7724)
==31730== by 0x25731F: Rf_eval (eval.c:1152)
==31730== by 0x25927D: R_execClosure (eval.c:2362)
==31730== by 0x25A35A: R_execMethod (eval.c:2535)
==31730== by 0x887E93F: R_dispatchGeneric (methods_list_dispatch.c:1151)
==31730== by 0x2A0E72: do_standardGeneric (objects.c:1344)
==31730== by 0x2577E7: Rf_eval (eval.c:1254)
==31730== by 0x25927D: R_execClosure (eval.c:2362)
==31730== by 0x25A01C: applyClosure_core (eval.c:2250)
==31730== Address 0x10 is not stack'd, malloc'd or (recently) free'd
==31730==
*** caught segfault ***
address 0x10, cause 'memory not mapped'
Traceback:
1: which(smap == j)
2: which(smap == j)
3: mysurv(fit2, pbc2$bili4, p0 = 4:0/10, fit2, x0 = 50)
The offending call is amost certainly the one to survfit; mysurv() is a local function
that caculates some things 'by hand'. It does nothing complex: counts, loops, etc, the
only non-base action is a call to Matrix::exp near the end, but the which() failure is
well before that.
The session info just before the offending material:
> sessionInfo()
R Under development (unstable) (2024-02-07 r85873)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 22.04.3 LTS
Matrix products: default
BLAS: /usr/local/src/R-devel/lib/libRblas.so
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: America/Chicago
tzcode source: system (glibc)
attached base packages:
[1] splines stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] Matrix_1.6-0
loaded via a namespace (and not attached):
[1] compiler_4.4.0 tools_4.4.0 grid_4.4.0 lattice_0.22-5
---
Footnote. The impetus for this is realizing that the robust variance for an
Aalen-Johansen was incorrect when there are case weights for a subject that vary over
time; a rare case but will occur with time dependent IPC weights. Carefully figuring
this out has been all I did for the last week, leading to a new routine survfitaj.c and
approx 14 pages of derivation and explanation in the methods.Rnw vignette. Subjects who
"change horses in midstream", i.e., swap from one curve to another mid-followup make the
code more complex. This arises out of the "extended Kaplan-Meier"; I am not a fan of
this statistically, but some will use it and expect my code to work.
--
Terry M Therneau, PhD
Department of Quantitative Health Sciences
Mayo Clinic
therneau using mayo.edu
"TERR-ree THUR-noh"
[[alternative HTML version deleted]]
More information about the R-devel
mailing list