[Rd] Warning on backslash sequences (was sprintf behavior)

Prof Brian Ripley ripley at stats.ox.ac.uk
Thu Sep 28 15:40:16 CEST 2006


Thanks, Bill, that is helpful.

I've been running a prototype across R itself and CRAN.  It is clear that 
we do have lots of strings with escaped newlines (the escape is generated 
in the parser if there is an embedded newline), so they need to be an 
exception.  With my first version 99 CRAN packages were flagged, but 
about 40 of those were flagged only for escaped newlines (often from 
inst/CITATION files).
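
To illustrate what an embedded newline looks like to the parser, here is a minimal sketch. The variable names are made up for the illustration; it only shows that a string literal containing a raw line break holds the same characters as one written with \n.

```r
# A string literal that spans two source lines (embedded newline).
embedded <- "first line
second line"

# The same string written with an explicit \n escape.
escaped <- "first line\nsecond line"

# Both hold the same characters: the line break is a single newline.
same <- identical(embedded, escaped)
n_chars <- nchar(embedded)   # 10 + 1 + 11 characters
```

Strings like this are common in multi-line messages and citation entries, which is why they have to be exempt from any warning.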

I don't understand why \` is regarded as intentional.  It crops up in 
packages date and survival (the same code) in

        stop(paste("\`", .Generic, "' not meaningful for dates",
                    sep = ""))

which clearly should use sQuote(.Generic) in R, but there seems no reason 
to escape in either R or Splus.
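
As a rough sketch of the two alternatives: `generic` below is a hypothetical stand-in for .Generic (which only exists inside a group-generic method), and useFancyQuotes is switched off only so that the result is plain ASCII and predictable.

```r
# Hypothetical stand-in for .Generic, which is only defined inside a method.
generic <- "+"

# Plain ASCII quotes so the result does not depend on the locale.
old_opts <- options(useFancyQuotes = FALSE)

# The original construction, with the backtick left unescaped.
msg_plain <- paste("`", generic, "' not meaningful for dates", sep = "")

# The same message using sQuote() to supply the quotes.
msg_squote <- paste(sQuote(generic), "not meaningful for dates")

options(old_opts)
```

The unescaped backtick gives the same string as the escaped one, and sQuote() removes the need to write quote characters by hand at all.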

On Wed, 27 Sep 2006, Bill Dunlap wrote:

>>> Splus's parser emits a warning when it sees a backslash
>>> outside of the recognized backslash sequence.  E.g.,
>>>
>>> > nchar("\Backslashed?")
>>>  [1] 12
>>>  Warning messages:
>>>    The initial backslash is ignored in \B -- not a recognized escape sequence.
>>>          Use \\ to make a backslash
>>>
>>> You might want to add that warning to R's parser.  I've
>>> seen the error in several R packages.  E.g.,
>>>
>>>  bayesmix/R/JAGScontrol.R:  text[4] <- "-inits.R\"\n\initialize\n"
>>>  SciViews/svDialogs/R/fixedDlg.wxPython.R:        if (length(grep("[\.]", basename(res))) == 0)
>>>
>>> The warning is mostly emitted when the error is benign, but it
>>> might help get people to think about what they are typing.
>>
>> I am not at all sure about this.  R's documentation says
>>  ...
>>         '\n'          newline
>>         '\r'          carriage return
>>         '\t'          tab
>>         '\b'          backspace
>>         '\a'          alert (bell)
>>         '\f'          form feed
>>         '\v'          vertical tab
>>         '\\'          backslash '\'
>>         '\nnn'        character with given octal code (1, 2 or 3 digits)
>>         '\xnn'        character with given hex code (1 or 2 hex digits)
>>         '\unnnn'      Unicode character with given code (1-4 hex digits)
>>         '\Unnnnnnnn'  Unicode character with given code (1-8 hex digits)
>>
>> so it is not an error in R.  People tend not to like being warned about
>> legitimate usage (and one can see this sort of thing being intentional in
>> machine-generated scripts: for example to escape spaces in file paths
>> and to escape line feeds).
>
> We had that same sort of argument here, but enough people were
> asking our support people about the matter that we put in
> the warning.  As I said, the warning is odd in that it warns
> about legitimate usage in the hopes that people will know to
> use "\\" for a backslash when they need to.
>
>> What exactly does Splus's parser allow as intentional?
>
> Splus currently "supports" (does not warn about)
>    \nnn  (1-3 octal digits)
>    \n, \t, \b, \r, \', \", and \`
> We do not support the \f, \v, \xnn, \unnnn, or \Unnnnnnnn.
> We should add the \f, \v, \a, and \xnn (as well as 0xnn for integers),
> but we overlooked those.  (Adding new backslash sequences is relatively
> safe: we have been warning about unrecognized \f's for years so
> we shouldn't expect to find too many folks using \f where they intended
> either \\f or f.)
>
> We don't support unicode so we won't do anything with the \unnnn or
> \Unnnnnnnn.  That is something Splus does need to warn about to aid in
> porting stuff from R.
>
> Neither of the examples I showed causes any ill effect,
> but the grep pattern '[\.]' suggests that its author
> doesn't know that dots are taken literally inside of
> square brackets in regular expressions, and the use of
> "\." outside of brackets would give incorrect results.
>
>>>  bayesmix/R/JAGScontrol.R:  text[4] <- "-inits.R\"\n\initialize\n"
>>>  SciViews/svDialogs/R/fixedDlg.wxPython.R:        if (length(grep("[\.]", basename(res))) == 0)
>
> In SciViews I also see
>   command <- sub("view\.[A-Za-z0-9\._]+[(]", "view(", command)
> which is almost certainly wrong and I suspect that its
>   cat(";for Options\AutoIndent: 0=Off, 1=follow language scoping and 2=copy from previous line\n",
> wants an extra \ before AutoIndent.
>
> ----------------------------------------------------------------------------
> Bill Dunlap
> Insightful Corporation
> bill at insightful dot com
> 360-428-8146
>
> "All statements in this message represent the opinions of the author and do
> not necessarily reflect Insightful Corporation policy or position."
>
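
Bill's point about the dot can be checked with a small sketch. The file names are invented for the example, and the bare "." pattern stands for what "\." becomes once the parser drops the backslash.

```r
# Invented file names: one with a dot, one without.
x <- c("report.txt", "README")

in_brackets <- grepl("[.]", x)              # dot is literal inside brackets
escaped     <- grepl("\\.", x)              # doubled backslash: literal dot
bare_dot    <- grepl(".", x)                # any character, so both match
fixed_dot   <- grepl(".", x, fixed = TRUE)  # literal match, no regex at all
```

Only the bare "." gives a different answer: it matches any character, so "README" is reported as containing a dot.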

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



