[Rd] incorrect output and segfaults from sprintf with %*d (PR#13667)

maechler at stat.math.ethz.ch
Wed Apr 22 16:10:25 CEST 2009


>>>>> "vQ" == Wacek Kusnierczyk <waku at idi.ntnu.no>
>>>>>     on Tue, 21 Apr 2009 13:05:11 +0200 (CEST) writes:

    vQ> Full_Name: Wacek Kusnierczyk
    vQ> Version: 2.10.0 r48365
    vQ> OS: Ubuntu 8.04 Linux 32bit
    vQ> Submission from: (NULL) (129.241.110.141)


    vQ> sprintf has a documented limit on strings included in the output using the
    vQ> format '%s'.  It appears that there is a limit on the length of strings included
    vQ> with, e.g., the format '%d' beyond which surprising things happen (output
    vQ> modified for conciseness):

    vQ> gregexpr('1', sprintf('%9000d', 1))
    vQ> # [1] 9000 9801

    vQ> gregexpr('1', sprintf('%9000d', 1))
    vQ> # [1]  9000  9801 10602

    vQ> gregexpr('1', sprintf('%9000d', 1))
    vQ> # [1]  9000  9801 10602 11403

    vQ> gregexpr('1', sprintf('%9000d', 1))
    vQ> # [1]  9000  9801 10602 11403 12204

    vQ> ...

    vQ> Note that not only more than one '1' is included in the output, but also that
    vQ> the same functional expression (no side effects used beyond the interface) gives
    vQ> different results on each execution.  Analogous behaviour can be observed with
    vQ> '%nd' where n > 8200.

    vQ> The actual output above is consistent across separate sessions.

    vQ> With sufficiently large field width values, R segfaults:

    vQ> sprintf('%*d', 10^5, 1)
    vQ> # *** caught segfault ***
    vQ> # address 0xbfcfc000, cause 'memory not mapped'
    vQ> # Segmentation fault


Thank you, Wacek.
That's all ``interesting'' ... unfortunately,
my version of 'man 3 sprintf' contains

>> BUGS
>>        Because sprintf() and vsprintf() assume an arbitrarily
>>        long string, callers must be careful not to overflow the
>>        actual space; this is often impossible to assure. Note
>>        that the length of the strings produced is
>>        locale-dependent and difficult to predict.  Use
>>        snprintf() and vsnprintf() instead (or asprintf() and vasprintf).

(Note the "impossible" part above.)
We haven't used snprintf() yet, probably because it requires
C99, and AFAIK we have only relatively recently started to
more or less rely on C99 in the R sources.
       
More precisely, I see that some Windows-only code relies on
snprintf() being available, whereas in at least one non-Windows
section I read   /* we cannot assume snprintf here */
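
To make the man page's advice concrete, here is a minimal sketch of
bounded formatting with snprintf(), assuming a C99 library.  It is not
code from the R sources; format_int_width() and its return convention
are hypothetical names chosen for illustration only.  The point is
that snprintf() never writes more than `cap' bytes and reports how
many characters the full result would have needed, so an oversized
field width can be detected instead of overrunning the buffer.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format `value` right-aligned in a field of `width` characters into `buf`,
 * which has room for `cap` bytes.
 *
 * Returns 0 if the whole result fit, and -1 if it was truncated (buf then
 * holds a NUL-terminated prefix) or if snprintf() reported an error.
 *
 * Hypothetical helper for illustration; not a function from the R sources.
 */
int format_int_width(char *buf, size_t cap, int width, int value)
{
    assert(buf != NULL && cap > 0);

    /* snprintf writes at most cap bytes, including the terminating NUL,
       and returns the length the complete output would have had. */
    int needed = snprintf(buf, cap, "%*d", width, value);

    if (needed < 0)
        return -1;                      /* output or encoding error */
    return ((size_t) needed < cap) ? 0 : -1;
}
```

With an 8192-byte buffer, a width of 9000 would make this helper
return -1 rather than write past the end, and the caller could then
signal an R-level error or allocate a larger buffer and retry.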

I typically leave such platform-dependency issues and the
corresponding configure settings to other R-corers with a much
wider overview of platforms and their compilers and C libraries.
       

BTW,
1) sprintf("%n %g", 1,1)   also segfaults

2) Did you have a true use case where the 8192 limit was
   actually a problem?
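
On point 1): in C, the %n conversion makes printf-family functions
store the number of characters written so far through an int pointer
argument, so handing it an ordinary number instead of a valid pointer
is undefined behaviour, which would be consistent with the crash.  One
possible guard, sketched below under the assumption that the format
string is checked before it reaches the C library, is to scan for %n
and reject it.  format_uses_percent_n() is a hypothetical helper, not
something that exists in R.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if `fmt` contains a %n conversion, allowing for flags,
 * field width, precision and length modifiers between '%' and 'n'.
 *
 * Hypothetical helper for illustration; not a function from the R sources.
 */
bool format_uses_percent_n(const char *fmt)
{
    for (const char *p = fmt; *p != '\0'; p++) {
        if (*p != '%')
            continue;

        p++;                            /* step past the '%' */

        /* Skip flags, width, precision and length modifiers. */
        p += strspn(p, "-+ #0123456789.*$hlLqjzt'");

        if (*p == '\0')
            return false;               /* dangling '%' at end of string */
        if (*p == 'n')
            return true;
        /* Any other conversion character, including the second '%' of a
           literal "%%", is skipped by the loop increment. */
    }
    return false;
}
```

A caller could use this to raise an error for formats such as
"%n %g" while still accepting "%5d" or a literal "%%n".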

Martin       

    vQ> sessionInfo()
    vQ> # R version 2.10.0 Under development (unstable) (2009-04-20 r48365) 
    vQ> # i686-pc-linux-gnu


