[Bioc-devel] [External] Re: Strange "internal logical NA value has been modified" error

Hervé Pagès hp@ge@@on@g|thub @end|ng |rom gm@||@com
Sat Oct 16 21:49:24 CEST 2021


Hi Pariksheet,

Interesting, and thanks for sharing.

Note that a situation where it's actually safe to modify R values passed 
to C is when you allocate the vector in R right before going to C. For 
example in your case I think that doing something like

     converged <- logical(1)

     converged[1] <- NA

     .Call(C_train,

           PACKAGE = "tsshmm",

           converged,

           ...

would have been ok.

That being said, it's certainly better style to stick to the "treat R 
values passed to C as read-only" rule, especially since it's almost 
always easy to do (in my experience). For example in your case, there's 
no need for the 'converged' argument. You could simply create this SEXP 
at the C level and have your .Call entry point return it, as I'm sure 
you figured out already.

Best,
H.


On 16/10/2021 08:54, Pariksheet Nanda wrote:
> Hi folks,
> 
> The fault was my directly overwriting the value of an R object pointing 
> to NA with an integer value:
> 
> # ./R/TSSHMM-class.R
>          ...
>          converged <- NA
>          .Call(C_train,
>                PACKAGE = "tsshmm",
>                converged,
>                ...
> 
> # src/R_wrap_tsshmm.c
> ...
> SEXP C_train(SEXP converged, ...)
> {
>    ...
>    INTEGER(converged)[0] = 0;
>    ...
> 
> To help me locate the issue, Martin Morgan e-mailed me off-list with 
> advice for treating R values passed to C as "read-only", and violating 
> this guideline indeed was the root of the bug.
> 
> I also wanted to quickly clear up a methodological error in my previous 
> e-mail about setting the gdb watchpoint to reproduce the fault.  It 
> turns out they way I set the watchpoint on the address of 
> R_LogicalNAValue gets triggered even when the NA value is relocated due 
> to benign events like package loads.  Instead, we want to set the 
> watchpoint at the location of the symbol value using `watch -location 
> SYMBOL`.  This survives SYMBOL relocations.  For completeness, below are 
> the gdb commands for automated setting of the watchpoint[2] and the gdb 
> output[3] of the fault.  Note that the gdb stack frame mentions the line 
> after the fault occurs; namely, the frame mentions line 61, but of 
> course line 60 is the offending line.
> 
> One last small detail: to get the stack trace without <optimized out>, I 
> had to recompile R with CFLAGS="-O0" and also needed to compile my 
> package for which I used devtools::with_debug(devtools::install())
> 
> Have a nice weekend,
> Pariksheet
> 
> 
> 
> I use `source -v /path/to/gdb/commands` to interleave
> the gdb commands prefixed with "+" in before their output;
> there's probably a gdb setting to do this, instead of
> resorting to this kludge of chaining scripts, but I
> couldn't find such a gdb setting.
> 
> [1] Contents of ~/src/tsshmm/test/wrap.gdb
> 
> # Echo gdb commands before gdb output:
> source -v ~/src/tsshmm/tests/batch.gdb
> 
> [2] Contents of ~/src/tsshmm/tests/batch.gdb
> 
> set pagination off
> echo *** Tell GDB where to find our package source files.\n
> directory ~/src/tsshmm/src
> echo *** Allow breakpoints on as yet to be loaded symbols, when using -x 
> to execute commands.\n
> set breakpoint pending on
> echo *** We can't set the watchpoint on R_LogicalNAValue unless we first 
> enter Rf_InitMemory.\n
> break memory.c:Rf_InitMemory
> run
> clear memory.c:Rf_InitMemory
> echo *** Wait for R_LogicalValue to be initialized.\n
> watch R_LogicalNAValue
> continue
> delete 2
> echo *** Set the watchpoint on the location of the NA value.\n
> print R_LogicalNAValue
> print * R_LogicalNAValue
> watch -l R_LogicalNAValue.u.symsxp.internal
> continue
> print * R_LogicalNAValue
> echo *** Now wait for for the memory corruption of R_LogicalNAValue.\n
> continue
> echo *** Bug reproduced! Some info:\n
> backtrace 10
> list
> print R_LogicalNAValue
> print * R_LogicalNAValue
> 
> [3] R command and gdb output:
> 
> omsai using xm1:/tmp/build-R$ Rscript -e 'library(devtools); 
> with_debug(install("~/src/tsshmm", quiet = TRUE))' && ./bin/R -q -d gdb 
> --debugger-args="-q -x ~/src/tsshmm/tests/wrap.gdb" -f 
> ~/src/tsshmm/tests/testthat.R --args DEBUG_INTERRUPT
> 
> Reading symbols from /tmp/build-R/bin/exec/R...
> +set pagination off
> +echo *** Tell GDB where to find our package source files.\n
> *** Tell GDB where to find our package source files.
> +directory ~/src/tsshmm/src
> +echo *** Allow breakpoints on as yet to be loaded symbols, when using 
> -x to execute commands.\n
> *** Allow breakpoints on as yet to be loaded symbols, when using -x to 
> execute commands.
> +set breakpoint pending on
> +echo *** We can't set the watchpoint on R_LogicalNAValue unless we 
> first enter Rf_InitMemory.\n
> *** We can't set the watchpoint on R_LogicalNAValue unless we first 
> enter Rf_InitMemory.
> +break memory.c:Rf_InitMemory
> Breakpoint 1 at 0x1881fe: file ../../../R-4.1.1/src/main/memory.c, line 
> 2136.
> +run
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> 
> Breakpoint 1, Rf_InitMemory () at ../../../R-4.1.1/src/main/memory.c:2136
> 2136        init_gctorture();
> +clear memory.c:Rf_InitMemory
> +echo *** Wait for R_LogicalValue to be initialized.\n
> *** Wait for R_LogicalValue to be initialized.
> +watch R_LogicalNAValue
> Hardware watchpoint 2: R_LogicalNAValue
> +continue
> 
> Hardware watchpoint 2: R_LogicalNAValue
> 
> Old value = (SEXP) 0x0
> New value = (SEXP) 0x555555a2f400
> Rf_InitMemory () at ../../../R-4.1.1/src/main/memory.c:2227
> 2227        LOGICAL(R_LogicalNAValue)[0] = NA_LOGICAL;
> +delete 2
> +echo *** Set the watchpoint on the location of the NA value.\n
> *** Set the watchpoint on the location of the NA value.
> +print R_LogicalNAValue
> $1 = (SEXP) 0x555555a2f400
> +print * R_LogicalNAValue
> $2 = {sxpinfo = {type = 10, scalar = 1, obj = 0, alt = 0, gp = 0, mark = 
> 0, debug = 0, trace = 0, spare = 0, gcgen = 0, gccls = 1, named = 0, 
> extra = 0}, attrib = 0x555555a2d550, gengc_next_node = 0x555555a2f3c8, 
> gengc_prev_node = 0x555555a2f438, u = {primsxp = {offset = 1}, symsxp = 
> {pname = 0x1, value = 0x0, internal = 0x0}, listsxp = {carval = 0x1, 
> cdrval = 0x0, tagval = 0x0}, envsxp = {frame = 0x1, enclos = 0x0, 
> hashtab = 0x0}, closxp = {formals = 0x1, body = 0x0, env = 0x0}, promsxp 
> = {value = 0x1, expr = 0x0, env = 0x0}}}
> +watch -l R_LogicalNAValue.u.symsxp.internal
> Hardware watchpoint 3: -location R_LogicalNAValue.u.symsxp.internal
> +continue
> 
> Hardware watchpoint 3: -location R_LogicalNAValue.u.symsxp.internal
> 
> Old value = (struct SEXPREC *) 0x0
> New value = (struct SEXPREC *) 0x80000000
> Rf_InitMemory () at ../../../R-4.1.1/src/main/memory.c:2228
> 2228        MARK_NOT_MUTABLE(R_LogicalNAValue);
> +print * R_LogicalNAValue
> $3 = {sxpinfo = {type = 10, scalar = 1, obj = 0, alt = 0, gp = 0, mark = 
> 0, debug = 0, trace = 0, spare = 0, gcgen = 0, gccls = 1, named = 0, 
> extra = 0}, attrib = 0x555555a2d550, gengc_next_node = 0x555555a2f3c8, 
> gengc_prev_node = 0x555555a2f438, u = {primsxp = {offset = 1}, symsxp = 
> {pname = 0x1, value = 0x0, internal = 0x80000000}, listsxp = {carval = 
> 0x1, cdrval = 0x0, tagval = 0x80000000}, envsxp = {frame = 0x1, enclos = 
> 0x0, hashtab = 0x80000000}, closxp = {formals = 0x1, body = 0x0, env = 
> 0x80000000}, promsxp = {value = 0x1, expr = 0x0, env = 0x80000000}}}
> +echo *** Now wait for for the memory corruption of R_LogicalNAValue.\n
> *** Now wait for for the memory corruption of R_LogicalNAValue.
> +continue
> [Detaching after fork from child process 2277]
> [Detaching after fork from child process 2280]
> [Detaching after fork from child process 2282]
>  > library(testthat)
>  > library(tsshmm)
> Loading required package: GenomicRanges
> Loading required package: stats4
> Loading required package: BiocGenerics
> Loading required package: parallel
> 
> Attaching package: ‘BiocGenerics’
> 
> The following objects are masked from ‘package:parallel’:
> 
>      clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
>      clusterExport, clusterMap, parApply, parCapply, parLapply,
>      parLapplyLB, parRapply, parSapply, parSapplyLB
> 
> The following objects are masked from ‘package:stats’:
> 
>      IQR, mad, sd, var, xtabs
> 
> The following objects are masked from ‘package:base’:
> 
>      anyDuplicated, append, as.data.frame, basename, cbind, colnames,
>      dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
>      grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
>      order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
>      rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
>      union, unique, unsplit, which.max, which.min
> 
> Loading required package: S4Vectors
> 
> Attaching package: ‘S4Vectors’
> 
> The following objects are masked from ‘package:base’:
> 
>      expand.grid, I, unname
> 
> Loading required package: IRanges
> Loading required package: GenomeInfoDb
>  >
>  > ## Debug memory corruption with GDB.
>  > if (commandArgs(trailingOnly = TRUE)[1] ==
> +     "DEBUG_INTERRUPT") {
> +     setwd("~/src/tsshmm/tests/")
> +     ## Stop to allow GDB to re-enable the hardware watchpoint.
> +     ##ps::ps_interrupt()
> +     ## Continue minimal testing.
> +     test_dir("testthat", package = "tsshmm",
> +              filter = "train|tss")
> + } else {
> +     test_check("tsshmm")
> + }
> [New Thread 0x7fffee157700 (LWP 2367)]
> [Detaching after fork from child process 2369]
> ✔ | F W S  OK | Context
> ⠏ |         0 | train
>                                                   'table' int 
> [1:1000000(1d)] 1 1 1 1 1 1 1 1 1 1 ...
>   - attr(*, "dimnames")=List of 1
>    ..$ obs: chr [1:1000000] "0" "1" "2" "3" ...
> INFO [2021-10-16 10:31:29]    0: Model initial transition and emission 
> matrices:
> INFO [2021-10-16 10:31:29] 0.9  0.05            0.05            | 0.90 
> 0.05  0.05
> INFO [2021-10-16 10:31:29]            1.0                       | 0.09 
> 0.90  0.01
> INFO [2021-10-16 10:31:29]            0.5  0.5                  | 0.09 
> 0.90  0.01
> INFO [2021-10-16 10:31:29] 1.0                                  | 0.09 
> 0.90  0.01
> INFO [2021-10-16 10:31:29]                      0.50  0.5       | 0.10 
> 0.45  0.45
> INFO [2021-10-16 10:31:29]                      0.45  0.1  0.45 | 0.10 
> 0.45  0.45
> INFO [2021-10-16 10:31:29] 0.5                             0.50 | 0.10 
> 0.45  0.45
> INFO [2021-10-16 10:31:29] Running Baum-Welch
> 
> Thread 1 "R" hit Hardware watchpoint 3: -location 
> R_LogicalNAValue.u.symsxp.internal
> 
> Old value = (struct SEXPREC *) 0x80000000
> New value = (struct SEXPREC *) 0x0
> C_train (converged=0x555555a2f400, obs=0x555571cc6580, 
> lengths=0x55555cdc53a0, dim=0x55556695e300, trans=0x55556bafffa0, 
> emis=0x555560695d70, emis_tied=0x5555642b47e8, start=0x55556bd13518) at 
> R_wrap_tsshmm.c:61
> 61      model_init(&model, &is_valid, INTEGER(dim), REAL(trans), 
> REAL(emis),
> +echo *** Bug reproduced! Some info:\n
> *** Bug reproduced! Some info:
> +backtrace 10
> #0  C_train (converged=0x555555a2f400, obs=0x555571cc6580, 
> lengths=0x55555cdc53a0, dim=0x55556695e300, trans=0x55556bafffa0, 
> emis=0x555560695d70, emis_tied=0x5555642b47e8, start=0x55556bd13518) at 
> R_wrap_tsshmm.c:61
> #1  0x000055555563ac59 in R_doDotCall (ofun=0x7ffff449c40a <C_train>, 
> nargs=8, cargs=0x7ffffff7a370, call=0x555560fbc550) at 
> ../../../R-4.1.1/src/main/dotcode.c:624
> #2  0x0000555555645bb3 in do_dotcall (call=0x555560fbc550, 
> op=0x555555a4a678, args=0x5555642a6c40, env=0x555564bc1978) at 
> ../../../R-4.1.1/src/main/dotcode.c:1281
> #3  0x0000555555693c8d in bcEval (body=0x555560fbe860, 
> rho=0x555564bc1978, useCache=TRUE) at ../../../R-4.1.1/src/main/eval.c:7117
> #4  0x0000555555680cd7 in Rf_eval (e=0x555560fbe860, rho=0x555564bc1978) 
> at ../../../R-4.1.1/src/main/eval.c:729
> #5  0x00005555556836a6 in R_execClosure (call=0x55556a106a08, 
> newrho=0x555564bc1978, sysparent=0x55555cf7bd30, rho=0x55555cf7bd30, 
> arglist=0x555564c24ed0, op=0x555564bf16a0) at 
> ../../../R-4.1.1/src/main/eval.c:1899
> #6  0x0000555555684103 in R_execMethod (op=0x555564bf16a0, 
> rho=0x555564c24fe8) at ../../../R-4.1.1/src/main/eval.c:2075
> #7  0x00007ffff4ab89f6 in R_dispatchGeneric (fname=0x5555615e6070, 
> ev=0x555564c24fe8, fdef=0x555560fd3fe0) at 
> ../../../../../R-4.1.1/src/library/methods/src/methods_list_dispatch.c:1145
> #8  0x00005555556e94ae in do_standardGeneric (call=0x555560fc22e8, 
> op=0x555555a55a60, args=0x555564c251e0, env=0x555564c24fe8) at 
> ../../../R-4.1.1/src/main/objects.c:1285
> #9  0x0000555555693796 in bcEval (body=0x555560fc22b0, 
> rho=0x555564c24fe8, useCache=TRUE) at ../../../R-4.1.1/src/main/eval.c:7074
> +list
> 56        SEXP emis, SEXP emis_tied, SEXP start)
> 57    {
> 58      ghmm_dmodel* model = NULL;
> 59      int is_valid = 0;
> 60      INTEGER(converged)[0] = 0;
> 61      model_init(&model, &is_valid, INTEGER(dim), REAL(trans), 
> REAL(emis),
> 62             INTEGER(emis_tied), REAL(start));
> 63      if (! is_valid) {
> 64        ghmm_dmodel_free(&model);
> 65        lerror("model provided is invalid!  See reasons above.");
> +print R_LogicalNAValue
> $4 = (SEXP) 0x555555a2f400
> +print * R_LogicalNAValue
> $5 = {sxpinfo = {type = 10, scalar = 1, obj = 0, alt = 0, gp = 0, mark = 
> 1, debug = 0, trace = 0, spare = 0, gcgen = 1, gccls = 1, named = 65535, 
> extra = 0}, attrib = 0x555555a2d550, gengc_next_node = 0x555555a2f438, 
> gengc_prev_node = 0x555555a7e308, u = {primsxp = {offset = 1}, symsxp = 
> {pname = 0x1, value = 0x0, internal = 0x0}, listsxp = {carval = 0x1, 
> cdrval = 0x0, tagval = 0x0}, envsxp = {frame = 0x1, enclos = 0x0, 
> hashtab = 0x0}, closxp = {formals = 0x1, body = 0x0, env = 0x0}, promsxp 
> = {value = 0x1, expr = 0x0, env = 0x0}}}
> (gdb)
> 
> _______________________________________________
> Bioc-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel

-- 
Hervé Pagès

Bioconductor Core Team
hpages.on.github using gmail.com



More information about the Bioc-devel mailing list