[Rd] sub('^', .....) bugs (PR#7742)

Gabor Grothendieck ggrothendieck at myway.com
Wed Mar 23 10:45:16 CET 2005

:     David> According to help(sub), the ^ should match the
:     David> zero-length string at the beginning of a string:
: yes, indeed.
:     David> sub('^','var',1:3) # "1" "2" "3"
:     David> sub('$','var',1:3) # "1var" "2var" "3var"
:     David> # This generates what I expected from the first case:
:     David> sub('^.','var',11:13)  # "var1" "var2" "var3"
: there are even more fishy things here:
: 1) In your cases, the integer 'x' argument is auto-coerced to
:    character, however that fails as soon as  'perl = TRUE' is used.
:  > sub('^','v_', 1:3, perl=TRUE)
:  Error in sub.perl(pattern, replacement, x, ignore.case) : 
: 	 invalid argument
:  {one can argue that this is not a bug, since the help file asks
:   for 'x' to be a character vector; OTOH, we have
:   as.character(.) magic in many other places, i.e. quite
:   naturally here;  
:   at least  perl=TRUE and perl=FALSE should behave consistently.}
: 2) The 'perl=TRUE' case behaves even more problematically here:
:   > sub('^','v_', LETTERS[1:3], perl=TRUE)
:   [1] "A\0e" "B\0J" "C\0S"
:   > sub('^','v_', LETTERS[1:3], perl=TRUE)
:   [1] "A\0J" "B\0P" "C\0J"
:   > sub('^','v_', LETTERS[1:3], perl=TRUE)
:   [1] "A\0\0" "B\0\0" "C\0m" 
:   >
:  i.e., the result is random nonsense.
: Note that this happens both for R-patched (2.0.1)  and R-devel (2.1.0 alpha).
: ==> "forwarded" as bug report to R-bugs
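
Until the perl=TRUE and perl=FALSE paths agree, one way to sidestep both
points above is to coerce explicitly and avoid the zero-length '^' match
altogether. A minimal sketch, not a fix for the bug itself; the helper name
add_prefix is hypothetical and not part of R:

```r
# Hypothetical helper: prepend a prefix without relying on sub('^', ...).
add_prefix <- function(x, prefix) {
  # Coerce explicitly so the result does not depend on how sub() or
  # sub(..., perl = TRUE) treats a non-character 'x'.
  x <- as.character(x)
  # Plain concatenation; no regular expression is involved.
  paste(prefix, x, sep = "")
}

add_prefix(1:3, "var")          # integer input is coerced first
add_prefix(LETTERS[1:3], "v_")  # character input passes through unchanged
```

This only covers the "prepend a string" use case from David's example; it
says nothing about why '^' fails to match.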

Also consider the following, which may be related.  In #1, gsub does not
place an X before the first word, and #2 causes R to hang.

R> R.version.string # Windows XP
[1] "R version 2.1.0, 2005-03-17"

R> gsub("\\b", "X", "The quick brown fox") # 1
[1] "The Xquick Xbrown Xfox"

R> gsub("\\b", "X", "The quick brown fox", perl = TRUE) # 2
... hangs ...
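
A possible workaround for the intent of #1 and #2, offered as a sketch
rather than a fix: match the words themselves and put them back with a
backreference, so that no zero-length match at \b is needed. The helper
name mark_words is hypothetical, and "word" is assumed here to mean a run
of alphanumerics or underscores:

```r
# Hypothetical helper: put 'mark' in front of every word in 'x'.
mark_words <- function(x, mark = "X") {
  # Each word is captured as group 1 and re-emitted after the mark.
  # Because every match consumes at least one character, the pattern
  # never matches the empty string.
  # Note: 'mark' is used as replacement text, so backslashes in it
  # would be interpreted by gsub().
  gsub("([[:alnum:]_]+)", paste(mark, "\\1", sep = ""), x)
}

mark_words("The quick brown fox")
```

Under those assumptions the call returns "XThe Xquick Xbrown Xfox", which
is what #1 was expected to give.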

