[Rd] Segfault in GetNewPage, memory.c.
Guillaume Yziquel
guillaume.yziquel at citycable.ch
Thu Jan 7 23:50:51 CET 2010
Hello.
I'm still working on my OCaml-R binding and I get a segfault in the
GetNewPage() function of memory.c.
For the record, the OCaml-R binding seems to work fine with OCaml
bytecode. The segfault here is the main issue I have with OCaml native
code. OCaml-R can be found at the following links.
Source code:
http://yziquel.homelinux.org/gitweb/?p=ocaml-r.git;a=summary
http://svn.gna.org/viewcvs/ocaml-r/branches/yziquel/
Debian packages for amd64:
http://yziquel.homelinux.org/debian/pool/main/o/ocaml-r/
Some documentation (not entirely up to date...):
http://yziquel.homelinux.org/topos/api/ocaml-r/R.html
http://yziquel.homelinux.org/topos/debian-ocamlr.html
Back to my segfault:
> yziquel at seldon:~/git/ocaml-finquote$ gdb -silent -d /home/yziquel/src/r-base-2.10.1/src/main/ _build/test/test.native
> Reading symbols from /home/yziquel/git/ocaml-finquote/_build/test/test.native...(no debugging symbols found)...done.
> (gdb) run
> Starting program: /home/yziquel/git/ocaml-finquote/_build/test/test.native
> [Thread debugging using libthread_db enabled]
>
> Program received signal SIGSEGV, Segmentation fault.
> GetNewPage (node_class=1) at memory.c:657
> 657 SNAP_NODE(s, base);
> (gdb) backtrace
> #0 GetNewPage (node_class=1) at memory.c:657
> #1 0x00007ffff7993c24 in Rf_allocVector (type=16, length=1) at memory.c:2030
> #2 0x00007ffff7981070 in Rf_mkString (s=0x6ae548 "require(quantmod)") at ../../src/include/Rinlinedfuns.h:582
> #3 0x000000000047d63f in parse_sexp ()
> #4 0x0000000000498990 in caml_c_call ()
> #5 0x00007ffff7fb37e8 in ?? ()
> #6 0x0000000000423aa0 in camlQuantmod__entry ()
> #7 0x00007ffff7fb5820 in ?? ()
> #8 0x0000000000421649 in caml_program ()
> #9 0x000000000012697e in ?? ()
> #10 0x00000000004989e6 in caml_start_program ()
> #11 0x0000000000000000 in ?? ()
> (gdb)
OCaml native code uses its own ABI, which is why the backtrace shows
so little on the OCaml side.
The segfault happens at the moment we try to evaluate
"require(quantmod)". The R interpreter is already up and running when we
execute this R command.
I wish to point out that this same piece of code works fine in OCaml
bytecode, which sort of implies that my C glue is reasonably OK.
From source code, the execution goes this way:
> let () = ignore (R.eval_string "require(quantmod)")
We're simply trying to evaluate the "require(quantmod)" string in R.
> let eval_string s = eval_langsxp (parse_sexp s)
eval_string calls
> external parse_sexp : string -> sexp = "parse_sexp"
which accesses the C glue code wrapping R_ParseVector.
> CAMLprim value parse_sexp (value s) {
> CAMLparam1(s);
> SEXP text ;
> SEXP pr ;
> ParseStatus status;
> PROTECT(text = mkString(String_val(s)));
> PROTECT(pr=R_ParseVector(text, 1, &status, R_NilValue));
> UNPROTECT(2);
> switch (status) {
> case PARSE_OK:
> break;
> case PARSE_INCOMPLETE:
> case PARSE_EOF:
> caml_raise_with_string(*caml_named_value("Parse_incomplete"), (String_val(s)));
> case PARSE_NULL:
> case PARSE_ERROR:
> caml_raise_with_string(*caml_named_value("Parse_error"), (String_val(s)));
> }
> CAMLreturn(Val_sexp(VECTOR_ELT(pr,0)));
> }
But before calling R_ParseVector, it allocates an R string holding the command.
> PROTECT(text = mkString(String_val(s)));
It is this call to mkString which gives the segfault. String_val
essentially is a macro that casts an OCaml value to a char *.
> yziquel at seldon:~/git/ocaml-finquote$ gdb -silent -d /home/yziquel/src/r-base-2.10.1/src/main/ _build/test/test.native
> Reading symbols from /home/yziquel/git/ocaml-finquote/_build/test/test.native...(no debugging symbols found)...done.
> (gdb) set breakpoint pending on
> (gdb) break Rf_mkString
> Breakpoint 1 at 0x420858
> (gdb) run
> Starting program: /home/yziquel/git/ocaml-finquote/_build/test/test.native
> [Thread debugging using libthread_db enabled]
>
> Breakpoint 1, Rf_mkString (s=0x6ae548 "require(quantmod)") at ../../src/include/Rinlinedfuns.h:582
> 582 PROTECT(t = allocVector(STRSXP, 1));
> (gdb) step
> 579 {
> (gdb)
> 582 PROTECT(t = allocVector(STRSXP, 1));
> (gdb)
> Rf_allocVector (type=16, length=1) at memory.c:1916
> 1916 {
> (gdb) next
> 1924 if (length < 0 )
> (gdb)
> 1928 switch (type) {
> (gdb)
> 1978 if (length <= 0)
> (gdb)
> 1984 size = PTR2VEC(length);
> (gdb)
> 2000 if (size <= NodeClassSize[1]) {
> (gdb)
> 2017 old_R_VSize = R_VSize;
> (gdb)
> 2020 if (FORCE_GC || NO_FREE_NODES() || VHEAP_FREE() < alloc_size) {
> (gdb)
> 2017 old_R_VSize = R_VSize;
> (gdb)
> 2020 if (FORCE_GC || NO_FREE_NODES() || VHEAP_FREE() < alloc_size) {
> (gdb)
> 2028 if (size > 0) {
> (gdb)
> 2029 if (node_class < NUM_SMALL_NODE_CLASSES) {
> (gdb)
> 2030 CLASS_GET_FREE_NODE(node_class, s);
> (gdb)
>
> Program received signal SIGSEGV, Segmentation fault.
> GetNewPage (node_class=1) at memory.c:657
> 657 SNAP_NODE(s, base);
> (gdb)
So CLASS_GET_FREE_NODE is #defined in memory.c as:
> #define CLASS_GET_FREE_NODE(c,s) do { \
> SEXP __n__ = R_GenHeap[c].Free; \
> if (__n__ == R_GenHeap[c].New) { \
> GetNewPage(c); \
> __n__ = R_GenHeap[c].Free; \
> } \
> R_GenHeap[c].Free = NEXT_NODE(__n__); \
> R_NodesInUse++; \
> (s) = __n__; \
> } while (0)
and here we have a call to GetNewPage.
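To make the role of the macro above easier to follow, here is a rough,
self-contained model of the same idea: a free pointer walks a circular
list, and a new page is allocated only when it reaches the sentinel (the
Free == New test). This is a sketch, not R's code. The names heap, page,
peg, free_node, heap_init, get_new_page, get_free_node, heap_destroy and
NODES_PER_PAGE are all made up for illustration, and the order in which
nodes are linked within a page is simplified compared to memory.c.

```c
#include <assert.h>
#include <stdlib.h>

enum { NODES_PER_PAGE = 4 };   /* illustrative only; R derives the count from R_PAGE_SIZE */

/* Simplified node: just the two list links. */
typedef struct node {
    struct node *next;
    struct node *prev;
} node;

/* One malloc'd block holding several nodes, chained like page->next in R. */
typedef struct page {
    struct page *next;
    node nodes[NODES_PER_PAGE];
} page;

typedef struct heap {
    node  peg;         /* sentinel, in the role of R_GenHeap[c].New */
    node *free_node;   /* next free node, in the role of R_GenHeap[c].Free */
    page *pages;
    int   page_count;
    int   in_use;
} heap;

/* Empty heap: the peg links to itself and the free pointer sits on the peg. */
void heap_init(heap *h) {
    h->peg.next = &h->peg;
    h->peg.prev = &h->peg;
    h->free_node = &h->peg;
    h->pages = NULL;
    h->page_count = 0;
    h->in_use = 0;
}

/* Allocate one page and link each of its nodes in just before the peg.
   Returns 0 on success, -1 if malloc fails. */
int get_new_page(heap *h) {
    int i;
    page *p = malloc(sizeof *p);
    if (p == NULL) return -1;
    p->next = h->pages;
    h->pages = p;
    h->page_count++;
    for (i = 0; i < NODES_PER_PAGE; i++) {
        node *s = &p->nodes[i];
        node *prev = h->peg.prev;   /* this read needs a correctly initialised peg */
        s->next = &h->peg;
        h->peg.prev = s;
        prev->next = s;
        s->prev = prev;
    }
    h->free_node = &p->nodes[0];
    return 0;
}

/* Hand out the next free node, allocating a page when none are left. */
node *get_free_node(heap *h) {
    node *n;
    assert(h->free_node != NULL);   /* NULL means a zeroed heap that heap_init never touched */
    n = h->free_node;
    if (n == &h->peg) {             /* free list exhausted */
        if (get_new_page(h) != 0) return NULL;
        n = h->free_node;
    }
    h->free_node = n->next;
    h->in_use++;
    return n;
}

/* Release every page and return the heap to its empty state. */
void heap_destroy(heap *h) {
    while (h->pages != NULL) {
        page *p = h->pages;
        h->pages = p->next;
        free(p);
    }
    heap_init(h);
}
```

In this model the first request on an empty heap always goes through
get_new_page, which matches the trace above, where the allocation for
mkString lands in GetNewPage.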
> yziquel at seldon:~/git/ocaml-finquote$ gdb -silent -d /home/yziquel/src/r-base-2.10.1/src/main/ _build/test/test.native
> Reading symbols from /home/yziquel/git/ocaml-finquote/_build/test/test.native...(no debugging symbols found)...done.
> (gdb) set breakpoint pending on
> (gdb) break GetNewPage
> Function "GetNewPage" not defined.
> Breakpoint 1 (GetNewPage) pending.
> (gdb) run
> Starting program: /home/yziquel/git/ocaml-finquote/_build/test/test.native
> [Thread debugging using libthread_db enabled]
>
> Breakpoint 1, GetNewPage (node_class=1) at memory.c:629
> 629 {
> (gdb) n
> 635 node_size = NODE_SIZE(node_class);
> (gdb)
> 638 page = malloc(R_PAGE_SIZE);
> (gdb)
> 639 if (page == NULL) {
> (gdb)
> 638 page = malloc(R_PAGE_SIZE);
> (gdb)
> 639 if (page == NULL) {
> (gdb)
> 646 R_ReportNewPage();
> (gdb)
> 648 page->next = R_GenHeap[node_class].pages;
> (gdb)
> 653 base = R_GenHeap[node_class].New;
> (gdb)
> 648 page->next = R_GenHeap[node_class].pages;
> (gdb)
> 653 base = R_GenHeap[node_class].New;
> (gdb)
> 648 page->next = R_GenHeap[node_class].pages;
> (gdb)
> 653 base = R_GenHeap[node_class].New;
> (gdb)
> 648 page->next = R_GenHeap[node_class].pages;
> (gdb)
> 650 R_GenHeap[node_class].PageCount++;
> (gdb)
> 654 for (i = 0; i < page_count; i++, data += node_size) {
> (gdb)
> 649 R_GenHeap[node_class].pages = page;
> (gdb)
> 653 base = R_GenHeap[node_class].New;
> (gdb)
> 654 for (i = 0; i < page_count; i++, data += node_size) {
> (gdb)
> 652 data = PAGE_DATA(page);
> (gdb)
> 669 SET_NODE_CLASS(s, node_class);
> (gdb)
> 657 SNAP_NODE(s, base);
> (gdb)
>
> Program received signal SIGSEGV, Segmentation fault.
> GetNewPage (node_class=1) at memory.c:657
> 657 SNAP_NODE(s, base);
> (gdb)
and SNAP_NODE is:
> /* snap in node s before node t */
> #define SNAP_NODE(s,t) do { \
> SEXP sn__n__ = (s); \
> SEXP next = (t); \
> SEXP prev = PREV_NODE(next); \
> SET_NEXT_NODE(sn__n__, next); \
> SET_PREV_NODE(next, sn__n__); \
> SET_NEXT_NODE(prev, sn__n__); \
> SET_PREV_NODE(sn__n__, prev); \
> } while (0)
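For reference, the pointer operations in SNAP_NODE can be written out as
plain C on a stripped-down node type. This is only a sketch under
simplifying assumptions: the struct node and the functions init_peg,
snap_before and list_length are hypothetical names, not part of R. It
shows that the macro dereferences s, t and the predecessor of t, so an
invalid value in any of the three would fault on this line. The trace
above does not establish which one it is.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for an R node header: only the two list links. */
typedef struct node {
    struct node *next;
    struct node *prev;
} node;

/* Make peg an empty circular list: it is its own predecessor and successor. */
void init_peg(node *peg) {
    peg->next = peg;
    peg->prev = peg;
}

/* Snap node s in before node t, in the same order of steps as SNAP_NODE. */
void snap_before(node *s, node *t) {
    node *prev = t->prev;    /* PREV_NODE(next): reads through t */
    assert(prev != NULL);    /* a NULL predecessor would fault two lines below */
    s->next = t;             /* SET_NEXT_NODE(sn__n__, next): writes through s */
    t->prev = s;             /* SET_PREV_NODE(next, sn__n__) */
    prev->next = s;          /* SET_NEXT_NODE(prev, sn__n__): writes through prev */
    s->prev = prev;          /* SET_PREV_NODE(sn__n__, prev) */
}

/* Count the nodes reachable from the peg, excluding the peg itself. */
int list_length(const node *peg) {
    int n = 0;
    const node *p = peg->next;
    while (p != peg) {
        n++;
        p = p->next;
    }
    return n;
}
```

With a peg set up by init_peg, snapping two nodes in before the peg
yields the order peg, first, second, peg. If the peg's prev link were
never initialised, the first snap would already read or write through a
bad pointer.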
I do not know how to track the segfault further except by looking at the
disassembled machine code. However, as the machine code seems to have
been optimised, it is not easy to read, and I would appreciate it if
someone could take the time to look for any obvious reason why there
may be a segfault. Any background information or pointers that help me
understand precisely what is supposed to happen in the memory
allocation code would also be very much appreciated.
All the best,
--
Guillaume Yziquel
http://yziquel.homelinux.org/