[Rd] incorrect output and segfaults from sprintf with %*d (PR#13667)
maechler at stat.math.ethz.ch
Wed Apr 22 16:10:25 CEST 2009
>>>>> "vQ" == Wacek Kusnierczyk <waku at idi.ntnu.no>
>>>>> on Tue, 21 Apr 2009 13:05:11 +0200 (CEST) writes:
vQ> Full_Name: Wacek Kusnierczyk
vQ> Version: 2.10.0 r48365
vQ> OS: Ubuntu 8.04 Linux 32bit
vQ> Submission from: (NULL) (129.241.110.141)
vQ> sprintf has a documented limit on strings included in the output using the
vQ> format '%s'. It appears that there is a limit on the length of strings included
vQ> with, e.g., the format '%d' beyond which surprising things happen (output
vQ> modified for conciseness):
vQ> gregexpr('1', sprintf('%9000d', 1))
vQ> # [1] 9000 9801
vQ> gregexpr('1', sprintf('%9000d', 1))
vQ> # [1] 9000 9801 10602
vQ> gregexpr('1', sprintf('%9000d', 1))
vQ> # [1] 9000 9801 10602 11403
vQ> gregexpr('1', sprintf('%9000d', 1))
vQ> # [1] 9000 9801 10602 11403 12204
vQ> ...
vQ> Note that not only more than one '1' is included in the output, but also that
vQ> the same functional expression (no side effects used beyond the interface) gives
vQ> different results on each execution. Analogous behaviour can be observed with
vQ> '%nd' where n > 8200.
vQ> The actual output above is consistent across separate sessions.
vQ> With sufficiently large field width values, R segfaults:
vQ> sprintf('%*d', 10^5, 1)
vQ> # *** caught segfault ***
vQ> # address 0xbfcfc000, cause 'memory not mapped'
vQ> # Segmentation fault
Thank you, Wacek.
That's all ``interesting'' ... unfortunately,
my version of 'man 3 sprintf' contains
>> BUGS
>> Because sprintf() and vsprintf() assume an arbitrarily
>> long string, callers must be careful not to overflow the
>> actual space; this is often impossible to assure. Note
>> that the length of the strings produced is
>> locale-dependent and difficult to predict. Use
>> snprintf() and vsnprintf() instead (or asprintf() and vasprintf).
(note the "impossible" part above)
and we haven't used snprintf() yet, probably because it
requires the C99 C standard, and AFAIK, we have only relatively
recently started to more or less rely on C99 in the R sources.
More precisely, I see that some Windows-only code relies on
snprintf() being available, whereas in at least one non-Windows
section I read /* we cannot assume snprintf here */
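For illustration only, here is a minimal sketch of what the man page's
snprintf() advice could look like in C99, assuming snprintf() is
available. The helper names format_int_width() and format_int_fixed()
are hypothetical and are not taken from the R sources; this is not a
proposed patch. The first helper measures the required length before
allocating, so a large field width such as the 10^5 in the report
cannot overrun the buffer; the second writes into a fixed buffer and
reports truncation instead of overflowing.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the R sources): format `value`
 * right-justified in a field of `width` characters. Returns a freshly
 * malloc'ed string that the caller must free(), or NULL on failure. */
char *format_int_width(int width, int value)
{
    assert(width >= 0);

    /* Pass 1: with a size of 0, C99 snprintf() writes nothing and
     * returns the number of characters the complete output would need,
     * not counting the terminating '\0'. */
    int needed = snprintf(NULL, 0, "%*d", width, value);
    if (needed < 0)
        return NULL;

    char *buf = malloc((size_t)needed + 1);
    if (buf == NULL)
        return NULL;

    /* Pass 2: the buffer is now known to be large enough. */
    snprintf(buf, (size_t)needed + 1, "%*d", width, value);
    return buf;
}

/* Hypothetical helper: format into a caller-supplied buffer of `size`
 * bytes. Returns 1 if the output was truncated, 0 if it fit, and -1 if
 * snprintf() reported an error. The buffer is never overrun. */
int format_int_fixed(char *buf, size_t size, int width, int value)
{
    int needed = snprintf(buf, size, "%*d", width, value);
    if (needed < 0)
        return -1;
    return (size_t)needed >= size;
}
```

With something along these lines, format_int_width(100000, 1) would
return a 100000-character string ending in '1', and a caller with a
fixed buffer could signal an error when format_int_fixed() reports
truncation. Whether the two-pass cost is acceptable, and whether
snprintf() can be assumed on all platforms, are the open questions.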
Now such platform dependency issues and corresponding configure
settings I do typically leave to other R-corers with a much
wider overview about platforms and their compilers and C libraries.
BTW,
1) sprintf("%n %g", 1,1) also seg.faults
2) Did you have a true use case where the 8192 limit was an
undesirable limit?
Martin
vQ> sessionInfo()
vQ> # R version 2.10.0 Under development (unstable) (2009-04-20 r48365)
vQ> # i686-pc-linux-gnu