[Rd] Infrequent but steady NULL-pointer caused segfault in as.POSIXlt.POSIXct (R 3.4.4)

William Dunlap wdunlap at tibco.com
Fri Aug 2 16:50:52 CEST 2019


If you can run things on Linux, try running a few iterations of that loop
under valgrind, setting gctorture(TRUE) before the loop.

% R --debugger=valgrind --silent
> gctorture(TRUE)
> for(i in 1:5) { ... body of your loop ... }

valgrind can show memory misuse that eventually will cause R to crash.
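
For what it is worth, the backtrace quoted below ends in strlen() being
called on a NULL pointer, which is undefined behaviour in C. A minimal C
sketch of a defensive guard follows. The helper names abbrev_or_empty()
and tz_abbrev() are hypothetical and are not part of R; this is not how
datetime.c is written, and a guard like this would only mask the symptom,
not explain why the pointer was NULL.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for tzname and tzset() in <time.h> */
#endif
#include <assert.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper (not part of R): return s, or "" when s is NULL,
   so that a later strlen() never receives a NULL pointer. */
static const char *abbrev_or_empty(const char *s)
{
    return s ? s : "";
}

/* Hypothetical helper: read tzname[idx] once into a local and guard it.
   Reading the global only once means the pointer that is checked is the
   same pointer that is used afterwards. */
static const char *tz_abbrev(int idx)
{
    assert(idx == 0 || idx == 1);   /* tzname has exactly two entries */
    const char *p = tzname[idx];    /* single read of the global */
    return abbrev_or_empty(p);
}

/* Example use, assuming tzset() has been called first:
       size_t len = strlen(tz_abbrev(0));   // safe even if tzname[0] is NULL
*/
```

With a guard of this kind the loop would see an empty time-zone
abbreviation instead of a segfault, which may make the rare event easier
to log and investigate.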

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Fri, Aug 2, 2019 at 1:23 AM Sun Yijiang <sunyijiang at gmail.com> wrote:

> The R script I run daily for hours looks like this:
>
> while (!finish) {
>     Sys.sleep(0.1)
>     time = as.integer(format(Sys.time(), "%H%M")) # always crashes here
>     if (new.data.timestamp() <= time)
>         next
>     # ... do some jobs for about 2 minutes ...
>     gc()
> }
>
> Basically it waits for new data, which arrives every 10 minutes,
> does some jobs, calls gc(), then loops again.  It works great most of
> the time, but crashes strangely once a month or so.  Although
> infrequent, the crash always happens at the same place and gives the
> same error info, like this:
>
>  *** caught segfault ***
> address (nil), cause 'memory not mapped'
>
> Traceback:
>  1: as.POSIXlt.POSIXct(x, tz)
>  2: as.POSIXlt(x, tz)
>  3: format.POSIXlt(as.POSIXlt(x, tz), format, usetz, ...)
>  4: structure(format.POSIXlt(as.POSIXlt(x, tz), format, usetz, ...),
>   names = names(x))
>  5: format.POSIXct(Sys.time(), format = "%H%M")
>  6: format(Sys.time(), format = "%H%M")
>  7: format(Sys.time(), format = "%H%M")
> … …
>
> I looked into the dumped core with gdb, and found something very strange:
>
> gdb /usr/lib64/R/bin/exec/R ~/core.30387
> (gdb) bt 5
> #0  0x00007f1dca844ff1 in __strlen_sse2_pminub () from /lib64/libc.so.6
> #1  0x00007f1dcb20e8f9 in Rf_mkChar (name=0x0) at envir.c:3725
> #2  0x00007f1dcb1dc225 in do_asPOSIXlt (call=<optimized out>,
> op=<optimized out>, args=<optimized out>,
>     env=<optimized out>) at datetime.c:705
> #3  0x00007f1dcb22197f in bcEval (body=body@entry=0x4064b28,
> rho=rho@entry=0xc449d38, useCache=useCache@entry=TRUE)
>     at eval.c:6473
> #4  0x00007f1dcb230370 in Rf_eval (e=0x4064b28,
> rho=rho@entry=0xc449d38) at eval.c:624
> (More stack frames follow...)
>
> Line 705 of src/main/datetime.c is a simple string-creating call:
> SET_STRING_ELT(tzone, 1, mkChar(R_tzname[0]));
>
> mkChar function is defined in envir.c:3725:
> 3723  SEXP mkChar(const char *name)
> 3724  {
> 3725      size_t len =  strlen(name);
> … …
>
> gdb shows that the string pointer (name=0x0) mkChar received is NULL,
> and subsequently strlen(NULL) caused the segfault.  But quite
> contradictorily, gdb shows the value passed to mkChar in the caller is
> valid:
>
> (gdb) frame 2
> #2  0x00007f1dcb1dc225 in do_asPOSIXlt (call=<optimized out>,
> op=<optimized out>, args=<optimized out>,
>     env=<optimized out>) at datetime.c:705
> 705 datetime.c: No such file or directory.
> (gdb) p tzname[0]
> $1 = 0x4cf39c0 "CST"
>
> R_tzname is an alias of tzname. (#define R_tzname tzname in the same file.)
>
> At first, I suspected that some library had corrupted memory
> and accidentally zeroed tzname (a global variable).  But this gdb
> trace shows that tzname is intact, and only the pointer passed to
> mkChar has somehow become zero.  Like this:
>
> mkChar(tzname[0])  // tzname[0] is "CST", address 0x4cf39c0
> … …
> SEXP mkChar(const char *name)  // name should be 0x4cf39c0, but gdb shows 0x0
> {
>     size_t len =  strlen(name);  // segfault, as name is NULL
> … …
>
> The only theory I can think of so far is that, on calling mkChar, the
> parameter passed on stack somehow got wiped out to zero by some buggy
> code in R or library.  At a higher level, what I see is this:  If you
> run format(Sys.time(), "%H%M") a million times a day (together with
> other code of course), once in a month or so this simple line can
> segfault.
>
> I'm lost here.  Could someone please point me in the right
> direction for investigating this problem further?
>
> Regards,
> Steve
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
