[R] gsub warning message

Greg Snow Greg.Snow at intermountainmail.org
Fri Aug 31 20:41:37 CEST 2007


What is happening is that before the regex engine can look at your
pattern, the R string parsing routines first process your input as a
string.  In the string processing there are certain things represented
using a backslash.  Try this code in R:

> cat('here\tthere\n')

The \t is made into a tab and the \n is made into a newline.  If you
want an actual backslash you need \\:

> cat('here\\tthere\n')
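
One way to convince yourself of this is to count characters.  The
following is only a small sketch (the variable names s1 and s2 are just
for illustration); it shows that \t is stored as one character (a tab),
while \\ is stored as one backslash followed by an ordinary t:

```r
s1 <- 'here\tthere'    # the parser turns \t into a single tab character
s2 <- 'here\\tthere'   # the parser turns \\ into a single backslash

nchar(s1)              # 10: "here" + tab + "there"
nchar(s2)              # 11: "here" + backslash + "t" + "there"

cat(s1, '\n')          # cat shows the string as stored: a tab between the words
cat(s2, '\n')          # cat shows one backslash: here\tthere
print(s2)              # print re-escapes the backslash: "here\\tthere"
```

Note that print() displays the string the way you would have to type
it, while cat() displays what is actually stored, which is what the
regex engine receives.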

So if you want the regex engine to see \. (which means a literal dot)
then you need to write \\. so that the string processing sees \\ and
converts it to \ before passing it to the regex engine.  If you write \.
then the string parser looks in its table of known escapes (\t, \n, and
others), but \. is not there (it is meaningful to regexes but not to
string processing), so it gives you the warning.  In your example you
are using it in the replacement portion, where the \ in front of . does
not do anything, which is why either "\\." or "." works.  If you are
using it in the pattern to match, then \\. (which gets reduced to \.)
matches a . (dot character) while . (without \) matches any single
character (with some possible exceptions), so in some cases it may give
different results.
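
As a rough illustration of where the backslash matters and where it does
not, here is a short sketch.  The test vector x is made up for this
example; the last two calls reuse the "AAA_I" string from your post:

```r
x <- c("a.b", "a_b")

# In the pattern: \\. reaches the regex engine as \. and matches only a dot
p1 <- gsub("\\.", "-", x)        # "a-b" "a_b"

# In the pattern: a bare . matches any single character
p2 <- gsub(".", "-", x)          # "---" "---"

# In the replacement: the backslash in front of . changes nothing
r1 <- gsub("_+", ".", "AAA_I")   # "AAA.I"
r2 <- gsub("_+", "\\.", "AAA_I") # "AAA.I"
```

The underscore is not special in a regex, so "_+" is enough as the
pattern here.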

Hope this helps,



-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at intermountainmail.org
(801) 408-8111
 
 

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch 
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Talbot Katz
> Sent: Friday, August 31, 2007 12:30 PM
> To: ligges at statistik.uni-dortmund.de
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] gsub warning message
> 
> Thank you for the swift response.  It looks like the code 
> works the same way with or without the "\\" in either the 
> search string: { "\\_+" or "_+" }  or the replacement string: 
> { "\\." or "." }.  I tested this in Windows and Linux 
> (although we're still on R 2.4.1 in Linux).  It's not clear 
> to me why I can use either two slashes or no slash safely, 
> but not one slash, and it makes me vaguely uneasy.  
> Obviously, I need to review regular expressions, but my usual 
> sources, such as http://perldoc.perl.org/perlre.html, don't 
> seem to address this issue.  I wonder whether there's a good 
> document explaining this.
> 
> --  TMK  --
> 212-460-5430	home
> 917-656-5351	cell
> 
> 
> >From: Uwe Ligges <ligges at statistik.uni-dortmund.de>
> >To: Talbot Katz <topkatz at msn.com>
> >CC: r-help at stat.math.ethz.ch
> >Subject: Re: [R] gsub warning message
> >Date: Fri, 31 Aug 2007 18:04:39 +0200
> >
> >
> >
> >Talbot Katz wrote:
> >>Hi.
> >>
> >>I am using R 2.5.1 on a Windows XP machine.  Here is an example of a
> >>piece of code I was running in older versions of R on the same 
> >>machine.  I am looking for underscores and replacing them with 
> >>periods.  This result is from R 2.4.1:
> >>
> >>>gsub ( "\\_+","\.","AAA_I")
> >>[1] "AAA.I"
> >>
> >>Here is what I get in R 2.5.1:
> >>
> >>>gsub ( "\\_+","\.","AAA_I")
> >>[1] "AAA.I"
> >>Warning messages:
> >>1: '\.' is an unrecognized escape in a character string
> >>2: unrecognized escape removed from "\."
> >>
> >>I still get the same result, which is what I want, but now I get a 
> >>warning message.  Am I actually doing something wrong that the 
> >>previous versions of R didn't warn me about?  Or is this warning 
> >>message unwarranted?  Is there a fully approved method for
> >>getting the same functionality?  Thanks!
> >
> >Yes, correct usage is either
> >   gsub ( "\\_+", ".", "AAA_I")
> >or
> >   gsub ( "\\_+", "\\.", "AAA_I")
> >
> >Uwe Ligges
> >
> >
> >
> >>--  TMK  --
> >>212-460-5430	home
> >>917-656-5351	cell
> >>
> >>______________________________________________
> >>R-help at stat.math.ethz.ch mailing list
> >>https://stat.ethz.ch/mailman/listinfo/r-help
> >>PLEASE do read the posting guide
> >>http://www.R-project.org/posting-guide.html
> >>and provide commented, minimal, self-contained, reproducible code.
> 


