[Rd] sub and gsub treat \\ incorrectly (PR#13454)

Andriy Miranskyy amiransk at uwo.ca
Tue Jan 20 07:02:35 CET 2009


Thank you, William! This makes things clearer.

I am trying to write a small converter from free text to TeX format.
To do that I need to replace every "_" with "\_" and every "&" with
"\&". Could you please tell me whether there is a way to do this
with gsub?

Regards,
Andriy
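
One possible way to do the replacement asked about above, as a minimal
sketch rather than a definitive answer: it assumes the default regex
engine (fixed = FALSE), where a doubled backslash in the replacement
stands for one literal backslash and \\1 re-inserts the matched
character. The helper name escape_tex is hypothetical and is not part
of base R.

```r
# Hypothetical helper: put a backslash in front of every "_" and "&".
escape_tex <- function(x) {
  # In R source, "\\\\\\1" is the four characters \ \ \ 1.
  # The replacement engine reads "\\" as one literal backslash
  # and "\1" as the first parenthesized group (the matched "_" or "&").
  gsub("([_&])", "\\\\\\1", x)
}

out <- escape_tex("a_b & c")
cat(out, "\n")  # cat() shows the raw characters, without R's escaping
```

With fixed = TRUE the replacement is taken literally, so a call such as
gsub("_", "\\_", x, fixed = TRUE) is another option for one character
at a time.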

Monday, January 19, 2009, 6:24:56 PM, you wrote:

>> -----Original Message-----
>> From: r-devel-bounces at r-project.org 
>> [mailto:r-devel-bounces at r-project.org] On Behalf Of amiransk at uwo.ca
>> Sent: Monday, January 19, 2009 10:25 AM
>> To: r-devel at stat.math.ethz.ch
>> Cc: R-bugs at r-project.org
>> Subject: [Rd] sub and gsub treat \\ incorrectly (PR#13454)
>> 
>> Sub and gsub treat \\ replacement pattern incorrectly
>> 
>> I expect
>>   sub("a","\\", "a", perl=T)
>> to produce
>>   [1] "\"
>> instead it generates
>>   [1] ""
>> 
>> On the other hand, if I run
>>   sub("a","\\\\", "a", perl=T)
>> it correctly outputs
>>   [1] "\\"

> The replacement pattern may include \\digit, which means
> to put the digit'th parenthesized subexpression into the
> replacement.  E.g.
>    > sub("([[:alpha:]]+) +([[:alpha:]]+)", "\\2 \\1", "One two three four five")
>    [1] "two One three four five"
>    > gsub("([[:alpha:]]+) +([[:alpha:]]+)", "\\2 \\1", "One two three four five")
>    [1] "two One four three five"
> To support this without ambiguity or surprises, \\ is expected
> to be followed by a digit (or L or U when perl=TRUE).

> When fixed=TRUE there is no possibility of a parenthesized
> subexpression, so \\2 is taken literally.
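
The rule quoted above can be illustrated with a short sketch. This is
not from the original thread; the variable names are made up for the
example, and it only exercises cases whose behavior is described in
this message or in the R documentation.

```r
x <- "a"

# Regex replacement: "\\\\" in R source is two backslash characters,
# which the replacement engine reads as ONE literal backslash.
one <- sub("a", "\\\\", x)
nchar(one)

# fixed = TRUE: the replacement is used literally, so the single
# backslash character written as "\\" in R source is kept as-is.
lit <- sub("a", "\\", x, fixed = TRUE)
nchar(lit)

# perl = TRUE additionally accepts \\U and \\L before a backreference
# to change the case of the inserted text.
up <- sub("(a)", "\\U\\1", "abc", perl = TRUE)
```

Using nchar() rather than print() avoids confusion with the doubled
backslashes that R shows when it prints a string.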

> help(gsub) is not explicit about this behavior.

> Because I initially made the same mistake, when I wrote the S+
> versions of gsub and sub I included a warning when the replacement
> included a \\ not followed by a digit:

>    > gsub("([[:alpha:]]+) +([[:alpha:]]+)", "\\ \\", "One two three four five")
>    [1] "    five"
>    Warning messages:
>      backslash in replacement argument of substituteString(fixed=F) is not
>            followed by backslash or digit, hence backslash is omitted in:
>            substituteString(pattern = pattern, replacement = replacement, x = x, extended ....

>> The same issue applies to gsub.
>> 
>> --please do not edit the information below--
>> 
>> Version:
>>  platform = i386-pc-mingw32
>>  arch = i386
>>  os = mingw32
>>  system = i386, mingw32
>>  status = 
>>  major = 2
>>  minor = 8.1
>>  year = 2008
>>  month = 12
>>  day = 22
>>  svn rev = 47281
>>  language = R
>>  version.string = R version 2.8.1 (2008-12-22)
>> 
>> Windows XP (build 2600) Service Pack 2
>> 
>> Locale:
>> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United 
>> States.1252;LC_MONETARY=English_United 
>> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>> 
>> Search Path:
>>  .GlobalEnv, package:stats, package:graphics, 
>> package:grDevices, package:utils, package:datasets, 
>> package:methods, Autoloads, package:base
>> 
>> -- 
>> Sincerely,
>>  Andriy
>> 