[Rd] R on Solaris 10 x64

Herve Pages hpages at fhcrc.org
Sat Apr 14 03:36:21 CEST 2007


Hi David,

Tai-Wei (David) Lin wrote:
> Hi R Developers,
> 
> Greg is helping me with debugging R on Solaris 10 x64. Please let us
> know if you have any thoughts or tips that can help us debug this.
> 
> Thanks,
> 
> David
> 
> 
> 
> ************
> Using default transfer plist
> in vector_io: permuting
> About to write
> 
>  *** caught segfault ***
> address e8554000, cause 'memory not mapped'
> 
> Traceback:
>  1: .External("do_hdf5save", call, sys.frame(sys.parent()), fileout,
>  ..., PACKAGE = "hdf5")
>  2: hdf5save(hdf5_Fstat, "Fstat", "geneNames", "genotype")
> aborting ...
> ************
> 
> We've tried many things to debug it:
> 
> * dbx Runtime Checking (RTC) is not detecting any (meaningful) memory
> access problems that I can see.
> 
> * The same on Solaris/SPARC.
> 
> * Neither does Valgrind on Linux.
> 
> * I've tried increasing the C stack size, assuming R could be running
> out of stack size. Didn't help.
> 
> Running R under dbx (without RTC) until the crash shows this:
> 
> ...
> About to write
> t@1 (l@1) signal SEGV (no mapping at the fault address) in _memcpy at
> 0xfe90444b
> 0xfe90444b: _memcpy+0x006b:     movaps   0x00000000(%esi),%xmm0
> Current function is H5D_select_mgath
>   379               HDmemcpy(tgath_buf,buf+off[curr_seq],curr_len);
> (dbx) where
> current thread: t@1
>   [1] _memcpy(0x0, 0xfdebc707, 0x9f5c4f0), at 0xfe90444b
> =>[2] H5D_select_mgath(_buf = 0x9f79580, space = 0x8966770, iter =
> 0x8045980, nelmts = 3120U, dxpl_cache = 0xfe170078, _tgath_buf =
> 0x9f5c4f0), line 379 in "H5Dselect.c"
>   [3] H5D_contig_write(io_info = 0x804620c, nelmts = 3120ULL, mem_type =
> 0x97b05c8, mem_space = 0x8966770, file_space = 0x8966770, tpath =
> 0x8ee7078, src_id = 201326906, dst_id = 201326904, buf = 0x9f79580),
> line 1418 in "H5Dio.c"
>   [4] H5D_write(dataset = 0x8f169c0, mem_type_id = 201326906, mem_space
> = 0x8966770, file_space = 0x8966770, dxpl_id = 671088643, buf =
> 0x9f79580), line 952 in "H5Dio.c"
>   [5] H5Dwrite(dset_id = 335544330, mem_type_id = 201326906,
> mem_space_id = 0, file_space_id = 0, plist_id = 671088643, buf =
> 0x9f79580), line 586 in "H5Dio.c"
>   [6] vector_io(call = 0x97234ec, writeflag = 1, dataset = 335544330,
> space = 268435472, obj = 0x98386a0), line 535 in "hdf5.c"
>   [7] hdf5_write_vector(call = 0x97234ec, id = 67108867, symname =
> 0x9cf35d0 "geneNames", val = 0x98386a0), line 693 in "hdf5.c"
>   [8] hdf5_save_object(call = 0x97234ec, fid = 67108867, symname =
> 0x9cf35d0 "geneNames", val = 0x98386a0), line 957 in "hdf5.c"
>   [9] do_hdf5save(args = 0x9723284), line 1104 in "hdf5.c"
>   [10] do_External(call = 0x86d62bc, op = 0x8371cd8, args = 0x972340c,
> env = 0x9723594), line 832 in "dotcode.c"
>   [11] Rf_eval(e = 0x86d62bc, rho = 0x9723594), line 445 in "eval.c"
>   [12] Rf_evalList(el = 0x86d6230, rho = 0x9723594, op = 0x837226c),
> line 1463 in "eval.c"
>   [13] Rf_eval(e = 0x86d6214, rho = 0x9723594), line 438 in "eval.c"
>   [14] do_begin(call = 0x86d56bc, op = 0x836709c, args = 0x86d61dc, rho
> = 0x9723594), line 1107 in "eval.c"
>   [15] Rf_eval(e = 0x86d56bc, rho = 0x9723594), line 431 in "eval.c"
>   [16] Rf_applyClosure(call = 0x9723738, op = 0x83c0328, arglist =
> 0x97236e4, rho = 0x8379b1c, suppliedenv = 0x8379b38), line 614 in "eval.c"
>   [17] Rf_eval(e = 0x9723738, rho = 0x8379b1c), line 455 in "eval.c"
>   [18] Rf_ReplIteration(rho = 0x8379b1c, savestack = 0, browselevel = 0,
> state = 0x8047328), line 256 in "main.c"
>   [19] R_ReplConsole(rho = 0x8379b1c, savestack = 0, browselevel = 0),
> line 305 in "main.c"
>   [20] run_Rmainloop(), line 944 in "main.c"
>   [21] Rf_mainloop(), line 951 in "main.c"
>   [22] main(ac = 4, av = 0x80477ac), line 33 in "Rmain.c"
> (dbx) p curr_len
> curr_len = 24960U
> (dbx) p curr_seq
> curr_seq = 0
> (dbx) p of
> dbx: "of" is not defined in the scope
> `libhdf5.so.0.0.0`H5Dselect.c`H5D_select_mgath:347`
> dbx: see `help scope' for details
> (dbx) p off
> off = 0x8042960
> (dbx) p tgath_buf
> tgath_buf = 0x9f5c4f0
> "\xd87\x83^H\xa8\xf3\x82^H0^X\x82^H^X\xd4\x81^H^P\x90\x81^H\xb8m\x80^H^H'\x80^H\x88^?^?^H\x908^?^H\xb0\xf7~^H\xd8\xad~^H\xf8\xb2~^H\xb8\x8e~^H\xe8]~^H\xe8\xcb\xed^HP\xe3}^Hh\xdd\xbb^H\x98\xc4}^H\xf0\xa0}^H\xa8r}^HH}\xc3^HpO|^HH^V|^H^X\xd8|^H\xc0\xb1|^H8=}^H\x90\xcd{^H^Pm{^H\xb8#{^Hx'{^H\x90\xf8x^HpKx^H^POx^H\xa8~w^H^H>w^H\xf0\xb2w^H\xc8^Ew^HX'x^H\xf8\xdbv^H"
> (dbx) p buf
> buf = 0x9f79580
> "\xd87\x83^H\xa8\xf3\x82^H0^X\x82^H^X\xd4\x81^H^P\x90\x81^H\xb8m\x80^H^H'\x80^H\x88^?^?^H\x908^?^H\xb0\xf7~^H\xd8\xad~^H\xf8\xb2~^H\xb8\x8e~^H\xe8]~^H\xe8\xcb\xed^HP\xe3}^Hh\xdd\xbb^H\x98\xc4}^H\xf0\xa0}^H\xa8r}^HH}\xc3^HpO|^HH^V|^H^X\xd8|^H\xc0\xb1|^H8=}^H\x90\xcd{^H^Pm{^H\xb8#{^Hx'{^H\x90\xf8x^HpKx^H^POx^H\xa8~w^H^H>w^H\xf0\xb2w^H\xc8^Ew^HX'x^H\xf8\xdbv^H"
> (dbx) p nseq
> nseq = 1U
> (dbx) p len
> len = 0x804195c
> (dbx) p len[0..2]
> len[0..2] =
> [0] = 24960U
> [1] = 140025512U
> [2] = 140013048U
> (dbx)
> 
> 
> The code in question (H5D_select_mgath in H5Dselect.c, from HDF5) is:
> 
> ...
>         /* Loop, while sequences left to process */
>         for(curr_seq=0; curr_seq<nseq; curr_seq++) {
>             /* Get the number of bytes in sequence */
>             curr_len=len[curr_seq];
> 
>             HDmemcpy(tgath_buf,buf+off[curr_seq],curr_len);
> 
>             /* Advance offset in gather buffer */
>             tgath_buf+=curr_len;
>         } /* end for */
> ...


What's the initial size of tgath_buf? You need to make sure that you are not stepping
out of it, i.e. that the sum of len[i] over 0 <= i < nseq is not greater than its initial size.
That's for the writing side. The same goes for the reading side: you need to make sure that
buf+off[i]+len[i]-1 is a safe place to be for every 0 <= i < nseq.

Otherwise, expect bad things to happen, and generally not in a reproducible way. So even
if this code never crashes on other systems, that doesn't mean it isn't broken.
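
The two checks above can be sketched in C roughly as follows. This is only an illustration, not HDF5 code: checked_gather() and its dst_size and src_size parameters are hypothetical, and it assumes you can find out the real sizes of both buffers. It validates every sequence first and only then copies, so a bad off[] or len[] entry is reported instead of crashing inside memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT part of HDF5: gather nseq byte sequences from
 * src into dst, refusing to touch memory outside either buffer.
 * dst_size and src_size are assumed to be the true buffer sizes in bytes.
 * Returns 0 on success, -1 if any sequence would overrun src or dst. */
static int checked_gather(unsigned char *dst, size_t dst_size,
                          const unsigned char *src, size_t src_size,
                          const size_t *off, const size_t *len, size_t nseq)
{
    size_t total = 0;  /* bytes that will have been written to dst so far */
    size_t i;

    /* First pass: validate every sequence before copying anything. */
    for (i = 0; i < nseq; i++) {
        /* Reading side: off[i] + len[i] must not pass the end of src.
         * Written as a subtraction so the check itself cannot overflow. */
        if (off[i] > src_size || len[i] > src_size - off[i])
            return -1;
        /* Writing side: the running sum of len[i] must fit in dst. */
        if (len[i] > dst_size - total)
            return -1;
        total += len[i];
    }
    assert(total <= dst_size);

    /* Second pass: same copy loop as the original, now known to be safe. */
    for (i = 0; i < nseq; i++) {
        memcpy(dst, src + off[i], len[i]);
        dst += len[i];
    }
    return 0;
}
```

Temporarily calling something like this in place of the plain loop (or just adding the two checks as assertions) should tell you whether the off/len values or the buffer sizes are wrong at the point of the crash.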

Cheers,
H.


> 
> where
> 
> ./src/hdf5-1.6.5/src/H5private.h:
> #define HDmemcpy(X,Y,Z)  memcpy((char*)(X),(const char*)(Y),Z)
> 
> Maybe the "curr_len = 24960U" value is too high. I have no way of
> knowing what it should be in this case.
> 
> The crash could be caused by a compiler bug, although it's not very
> likely. These crashes have occurred both with and without optimization,
> with and without -g.
> 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>


