[Rd] Help: malloc/free deadlock in unsafe signal handler 'Rf_onsigusr1'

Ming Li mli at pivotal.io
Thu Jul 28 10:35:51 CEST 2016


Hi all,

I am working on a bug that occurs when running PLR on HAWQ. The process hangs
and can't be terminated.

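From my investigation, it seems the signal handler 'Rf_onsigusr1' triggers a
malloc/free deadlock: the signal arrives while free() is running (frames
#8-#11), and the handler then reaches malloc() through gettext (frames
#2-#5), which waits on a lock that the interrupted free() presumably still
holds.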

The call stack is below.

Thread 1 (Thread 0x7f4c93af48e0 (LWP 431263)):
#0  0x00007f4c9015805e in __lll_lock_wait_private () from /lib64/libc.so.6
#1  0x00007f4c900dd16b in _L_lock_9503 () from /lib64/libc.so.6
#2  0x00007f4c900da6a6 in malloc () from /lib64/libc.so.6
#3  0x00007f4c9008fb39 in _nl_make_l10nflist () from /lib64/libc.so.6
#4  0x00007f4c9008ddf5 in _nl_find_domain () from /lib64/libc.so.6
#5  0x00007f4c9008d6e0 in __dcigettext () from /lib64/libc.so.6
#6  0x00007f4c6fabcfe3 in Rf_onsigusr1 () from /usr/local/lib64/R/lib/libR.so
#7  <signal handler called>
#8  0x00007f4c9014079a in brk () from /lib64/libc.so.6
#9  0x00007f4c90140845 in sbrk () from /lib64/libc.so.6
#10 0x00007f4c900dd769 in __default_morecore () from /lib64/libc.so.6
#11 0x00007f4c900d87a2 in _int_free () from /lib64/libc.so.6
#12 0x0000000000b3ff24 in gp_free2 ()
#13 0x0000000000b356fc in AllocSetDelete ()
#14 0x0000000000b38391 in MemoryContextDeleteImpl ()
#15 0x000000000077c851 in ExecEndAgg ()
#16 0x00000000007592ad in ExecEndNode ()
#17 0x000000000075186c in ExecEndPlan ()
#18 0x000000000079dffa in ExecEndSubqueryScan ()
#19 0x000000000075921d in ExecEndNode ()
#20 0x000000000075186c in ExecEndPlan ()
#21 0x0000000000752565 in ExecutorEnd ()
#22 0x00000000006dd9bd in PortalCleanup ()
#23 0x0000000000b3f077 in AtCommit_Portals ()
#24 0x000000000051abe5 in CommitTransaction ()
#25 0x000000000051f1d5 in CommitTransactionCommand ()
#26 0x000000000099809e in PostgresMain ()
#27 0x00000000008f1031 in BackendStartup ()
#28 0x00000000008f70e0 in PostmasterMain ()
#29 0x00000000007f63da in main ()


I googled and found the following advice, which may help fix it: the best way
to avoid this kind of deadlock is to call only asynchronous-safe functions
within signal handlers.

https://www.securecoding.cert.org/confluence/display/c/SIG30-C.+Call+only+asynchronous-safe+functions+within+signal+handlers
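
To make that advice concrete, here is a minimal sketch in C of a handler that
only sets a flag and calls write(2), with the real work deferred to normal
context. This is not R's actual code: the names usr1_pending,
on_sigusr1_safe, install_safe_handler and process_pending_signal are all
hypothetical, and how R would poll such a flag is an assumption.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for sigaction() under strict -std modes */
#endif
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical flag: set by the handler, polled from normal context. */
static volatile sig_atomic_t usr1_pending = 0;

/* Handler restricted to async-signal-safe operations: a store to a
 * sig_atomic_t flag and a write(2). No malloc, no gettext, no stdio. */
void on_sigusr1_safe(int sig)
{
    static const char msg[] = "SIGUSR1 received, deferring work\n";
    int saved_errno = errno;      /* write() may clobber errno */

    (void)sig;
    usr1_pending = 1;
    (void)write(STDERR_FILENO, msg, sizeof msg - 1);
    errno = saved_errno;
}

/* Install the handler; returns 0 on success, -1 on failure. */
int install_safe_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigusr1_safe;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Call from normal (non-signal) context, e.g. once per loop iteration.
 * Returns 1 if a deferred SIGUSR1 was picked up, 0 otherwise. */
int process_pending_signal(void)
{
    if (!usr1_pending)
        return 0;
    usr1_pending = 0;
    /* Here it is safe to call malloc(), gettext(), save state, etc. */
    return 1;
}
```

With this pattern the handler never takes the allocator lock, so interrupting
free() as in frames #8-#11 could not block in the handler. The cost is that
the reaction to SIGUSR1 is delayed until the next call to
process_pending_signal().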

Thanks a lot.




More information about the R-devel mailing list