[Bioc-devel] [External] Re: Strange "internal logical NA value has been modified" error

Pariksheet Nanda p@r|k@heet@n@nd@ @end|ng |rom uconn@edu
Sat Oct 16 17:54:17 CEST 2021


Hi folks,

The fault was my directly overwriting the value of an R object pointing 
to NA with an integer value:

# ./R/TSSHMM-class.R
         ...
         converged <- NA
         .Call(C_train,
               PACKAGE = "tsshmm",
               converged,
               ...

# src/R_wrap_tsshmm.c
...
SEXP C_train(SEXP converged, ...)
{
   ...
   INTEGER(converged)[0] = 0;
   ...

To help me locate the issue, Martin Morgan e-mailed me off-list with 
advice for treating R values passed to C as "read-only", and violating 
this guideline indeed was the root of the bug.

I also wanted to quickly clear up a methodological error in my previous 
e-mail about setting the gdb watchpoint to reproduce the fault.  It 
turns out they way I set the watchpoint on the address of 
R_LogicalNAValue gets triggered even when the NA value is relocated due 
to benign events like package loads.  Instead, we want to set the 
watchpoint at the location of the symbol value using `watch -location 
SYMBOL`.  This survives SYMBOL relocations.  For completeness, below are 
the gdb commands for automated setting of the watchpoint[2] and the gdb 
output[3] of the fault.  Note that the gdb stack frame mentions the line 
after the fault occurs; namely, the frame mentions line 61, but of 
course line 60 is the offending line.

One last small detail: to get the stack trace without <optimized out>, I 
had to recompile R with CFLAGS="-O0" and also needed to compile my 
package for which I used devtools::with_debug(devtools::install())

Have a nice weekend,
Pariksheet



I use `source -v /path/to/gdb/commands` to interleave
the gdb commands prefixed with "+" in before their output;
there's probably a gdb setting to do this, instead of
resorting to this kludge of chaining scripts, but I
couldn't find such a gdb setting.

[1] Contents of ~/src/tsshmm/test/wrap.gdb

# Echo gdb commands before gdb output:
source -v ~/src/tsshmm/tests/batch.gdb

[2] Contents of ~/src/tsshmm/tests/batch.gdb

set pagination off
echo *** Tell GDB where to find our package source files.\n
directory ~/src/tsshmm/src
echo *** Allow breakpoints on as yet to be loaded symbols, when using -x 
to execute commands.\n
set breakpoint pending on
echo *** We can't set the watchpoint on R_LogicalNAValue unless we first 
enter Rf_InitMemory.\n
break memory.c:Rf_InitMemory
run
clear memory.c:Rf_InitMemory
echo *** Wait for R_LogicalValue to be initialized.\n
watch R_LogicalNAValue
continue
delete 2
echo *** Set the watchpoint on the location of the NA value.\n
print R_LogicalNAValue
print * R_LogicalNAValue
watch -l R_LogicalNAValue.u.symsxp.internal
continue
print * R_LogicalNAValue
echo *** Now wait for for the memory corruption of R_LogicalNAValue.\n
continue
echo *** Bug reproduced! Some info:\n
backtrace 10
list
print R_LogicalNAValue
print * R_LogicalNAValue

[3] R command and gdb output:

omsai using xm1:/tmp/build-R$ Rscript -e 'library(devtools); 
with_debug(install("~/src/tsshmm", quiet = TRUE))' && ./bin/R -q -d gdb 
--debugger-args="-q -x ~/src/tsshmm/tests/wrap.gdb" -f 
~/src/tsshmm/tests/testthat.R --args DEBUG_INTERRUPT

Reading symbols from /tmp/build-R/bin/exec/R...
+set pagination off
+echo *** Tell GDB where to find our package source files.\n
*** Tell GDB where to find our package source files.
+directory ~/src/tsshmm/src
+echo *** Allow breakpoints on as yet to be loaded symbols, when using 
-x to execute commands.\n
*** Allow breakpoints on as yet to be loaded symbols, when using -x to 
execute commands.
+set breakpoint pending on
+echo *** We can't set the watchpoint on R_LogicalNAValue unless we 
first enter Rf_InitMemory.\n
*** We can't set the watchpoint on R_LogicalNAValue unless we first 
enter Rf_InitMemory.
+break memory.c:Rf_InitMemory
Breakpoint 1 at 0x1881fe: file ../../../R-4.1.1/src/main/memory.c, line 
2136.
+run
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Breakpoint 1, Rf_InitMemory () at ../../../R-4.1.1/src/main/memory.c:2136
2136        init_gctorture();
+clear memory.c:Rf_InitMemory
+echo *** Wait for R_LogicalValue to be initialized.\n
*** Wait for R_LogicalValue to be initialized.
+watch R_LogicalNAValue
Hardware watchpoint 2: R_LogicalNAValue
+continue

Hardware watchpoint 2: R_LogicalNAValue

Old value = (SEXP) 0x0
New value = (SEXP) 0x555555a2f400
Rf_InitMemory () at ../../../R-4.1.1/src/main/memory.c:2227
2227        LOGICAL(R_LogicalNAValue)[0] = NA_LOGICAL;
+delete 2
+echo *** Set the watchpoint on the location of the NA value.\n
*** Set the watchpoint on the location of the NA value.
+print R_LogicalNAValue
$1 = (SEXP) 0x555555a2f400
+print * R_LogicalNAValue
$2 = {sxpinfo = {type = 10, scalar = 1, obj = 0, alt = 0, gp = 0, mark = 
0, debug = 0, trace = 0, spare = 0, gcgen = 0, gccls = 1, named = 0, 
extra = 0}, attrib = 0x555555a2d550, gengc_next_node = 0x555555a2f3c8, 
gengc_prev_node = 0x555555a2f438, u = {primsxp = {offset = 1}, symsxp = 
{pname = 0x1, value = 0x0, internal = 0x0}, listsxp = {carval = 0x1, 
cdrval = 0x0, tagval = 0x0}, envsxp = {frame = 0x1, enclos = 0x0, 
hashtab = 0x0}, closxp = {formals = 0x1, body = 0x0, env = 0x0}, promsxp 
= {value = 0x1, expr = 0x0, env = 0x0}}}
+watch -l R_LogicalNAValue.u.symsxp.internal
Hardware watchpoint 3: -location R_LogicalNAValue.u.symsxp.internal
+continue

Hardware watchpoint 3: -location R_LogicalNAValue.u.symsxp.internal

Old value = (struct SEXPREC *) 0x0
New value = (struct SEXPREC *) 0x80000000
Rf_InitMemory () at ../../../R-4.1.1/src/main/memory.c:2228
2228        MARK_NOT_MUTABLE(R_LogicalNAValue);
+print * R_LogicalNAValue
$3 = {sxpinfo = {type = 10, scalar = 1, obj = 0, alt = 0, gp = 0, mark = 
0, debug = 0, trace = 0, spare = 0, gcgen = 0, gccls = 1, named = 0, 
extra = 0}, attrib = 0x555555a2d550, gengc_next_node = 0x555555a2f3c8, 
gengc_prev_node = 0x555555a2f438, u = {primsxp = {offset = 1}, symsxp = 
{pname = 0x1, value = 0x0, internal = 0x80000000}, listsxp = {carval = 
0x1, cdrval = 0x0, tagval = 0x80000000}, envsxp = {frame = 0x1, enclos = 
0x0, hashtab = 0x80000000}, closxp = {formals = 0x1, body = 0x0, env = 
0x80000000}, promsxp = {value = 0x1, expr = 0x0, env = 0x80000000}}}
+echo *** Now wait for for the memory corruption of R_LogicalNAValue.\n
*** Now wait for for the memory corruption of R_LogicalNAValue.
+continue
[Detaching after fork from child process 2277]
[Detaching after fork from child process 2280]
[Detaching after fork from child process 2282]
 > library(testthat)
 > library(tsshmm)
Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

     clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
     clusterExport, clusterMap, parApply, parCapply, parLapply,
     parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

     IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

     anyDuplicated, append, as.data.frame, basename, cbind, colnames,
     dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
     grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
     order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
     rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
     union, unique, unsplit, which.max, which.min

Loading required package: S4Vectors

Attaching package: ‘S4Vectors’

The following objects are masked from ‘package:base’:

     expand.grid, I, unname

Loading required package: IRanges
Loading required package: GenomeInfoDb
 >
 > ## Debug memory corruption with GDB.
 > if (commandArgs(trailingOnly = TRUE)[1] ==
+     "DEBUG_INTERRUPT") {
+     setwd("~/src/tsshmm/tests/")
+     ## Stop to allow GDB to re-enable the hardware watchpoint.
+     ##ps::ps_interrupt()
+     ## Continue minimal testing.
+     test_dir("testthat", package = "tsshmm",
+              filter = "train|tss")
+ } else {
+     test_check("tsshmm")
+ }
[New Thread 0x7fffee157700 (LWP 2367)]
[Detaching after fork from child process 2369]
✔ | F W S  OK | Context
⠏ |         0 | train
                                                  'table' int 
[1:1000000(1d)] 1 1 1 1 1 1 1 1 1 1 ...
  - attr(*, "dimnames")=List of 1
   ..$ obs: chr [1:1000000] "0" "1" "2" "3" ...
INFO [2021-10-16 10:31:29]    0: Model initial transition and emission 
matrices:
INFO [2021-10-16 10:31:29] 0.9  0.05            0.05            | 0.90 
0.05  0.05
INFO [2021-10-16 10:31:29]            1.0                       | 0.09 
0.90  0.01
INFO [2021-10-16 10:31:29]            0.5  0.5                  | 0.09 
0.90  0.01
INFO [2021-10-16 10:31:29] 1.0                                  | 0.09 
0.90  0.01
INFO [2021-10-16 10:31:29]                      0.50  0.5       | 0.10 
0.45  0.45
INFO [2021-10-16 10:31:29]                      0.45  0.1  0.45 | 0.10 
0.45  0.45
INFO [2021-10-16 10:31:29] 0.5                             0.50 | 0.10 
0.45  0.45
INFO [2021-10-16 10:31:29] Running Baum-Welch

Thread 1 "R" hit Hardware watchpoint 3: -location 
R_LogicalNAValue.u.symsxp.internal

Old value = (struct SEXPREC *) 0x80000000
New value = (struct SEXPREC *) 0x0
C_train (converged=0x555555a2f400, obs=0x555571cc6580, 
lengths=0x55555cdc53a0, dim=0x55556695e300, trans=0x55556bafffa0, 
emis=0x555560695d70, emis_tied=0x5555642b47e8, start=0x55556bd13518) at 
R_wrap_tsshmm.c:61
61      model_init(&model, &is_valid, INTEGER(dim), REAL(trans), REAL(emis),
+echo *** Bug reproduced! Some info:\n
*** Bug reproduced! Some info:
+backtrace 10
#0  C_train (converged=0x555555a2f400, obs=0x555571cc6580, 
lengths=0x55555cdc53a0, dim=0x55556695e300, trans=0x55556bafffa0, 
emis=0x555560695d70, emis_tied=0x5555642b47e8, start=0x55556bd13518) at 
R_wrap_tsshmm.c:61
#1  0x000055555563ac59 in R_doDotCall (ofun=0x7ffff449c40a <C_train>, 
nargs=8, cargs=0x7ffffff7a370, call=0x555560fbc550) at 
../../../R-4.1.1/src/main/dotcode.c:624
#2  0x0000555555645bb3 in do_dotcall (call=0x555560fbc550, 
op=0x555555a4a678, args=0x5555642a6c40, env=0x555564bc1978) at 
../../../R-4.1.1/src/main/dotcode.c:1281
#3  0x0000555555693c8d in bcEval (body=0x555560fbe860, 
rho=0x555564bc1978, useCache=TRUE) at ../../../R-4.1.1/src/main/eval.c:7117
#4  0x0000555555680cd7 in Rf_eval (e=0x555560fbe860, rho=0x555564bc1978) 
at ../../../R-4.1.1/src/main/eval.c:729
#5  0x00005555556836a6 in R_execClosure (call=0x55556a106a08, 
newrho=0x555564bc1978, sysparent=0x55555cf7bd30, rho=0x55555cf7bd30, 
arglist=0x555564c24ed0, op=0x555564bf16a0) at 
../../../R-4.1.1/src/main/eval.c:1899
#6  0x0000555555684103 in R_execMethod (op=0x555564bf16a0, 
rho=0x555564c24fe8) at ../../../R-4.1.1/src/main/eval.c:2075
#7  0x00007ffff4ab89f6 in R_dispatchGeneric (fname=0x5555615e6070, 
ev=0x555564c24fe8, fdef=0x555560fd3fe0) at 
../../../../../R-4.1.1/src/library/methods/src/methods_list_dispatch.c:1145
#8  0x00005555556e94ae in do_standardGeneric (call=0x555560fc22e8, 
op=0x555555a55a60, args=0x555564c251e0, env=0x555564c24fe8) at 
../../../R-4.1.1/src/main/objects.c:1285
#9  0x0000555555693796 in bcEval (body=0x555560fc22b0, 
rho=0x555564c24fe8, useCache=TRUE) at ../../../R-4.1.1/src/main/eval.c:7074
+list
56        SEXP emis, SEXP emis_tied, SEXP start)
57    {
58      ghmm_dmodel* model = NULL;
59      int is_valid = 0;
60      INTEGER(converged)[0] = 0;
61      model_init(&model, &is_valid, INTEGER(dim), REAL(trans), REAL(emis),
62             INTEGER(emis_tied), REAL(start));
63      if (! is_valid) {
64        ghmm_dmodel_free(&model);
65        lerror("model provided is invalid!  See reasons above.");
+print R_LogicalNAValue
$4 = (SEXP) 0x555555a2f400
+print * R_LogicalNAValue
$5 = {sxpinfo = {type = 10, scalar = 1, obj = 0, alt = 0, gp = 0, mark = 
1, debug = 0, trace = 0, spare = 0, gcgen = 1, gccls = 1, named = 65535, 
extra = 0}, attrib = 0x555555a2d550, gengc_next_node = 0x555555a2f438, 
gengc_prev_node = 0x555555a7e308, u = {primsxp = {offset = 1}, symsxp = 
{pname = 0x1, value = 0x0, internal = 0x0}, listsxp = {carval = 0x1, 
cdrval = 0x0, tagval = 0x0}, envsxp = {frame = 0x1, enclos = 0x0, 
hashtab = 0x0}, closxp = {formals = 0x1, body = 0x0, env = 0x0}, promsxp 
= {value = 0x1, expr = 0x0, env = 0x0}}}
(gdb)



More information about the Bioc-devel mailing list