[R] gsub issue in R 2.11.1, but not present in 2.9.2
Bert Gunter
gunter.berton at gene.com
Tue Jun 29 20:07:32 CEST 2010
Jason:
I think it's actually even a bit worse than what Duncan said, which was:
-----------
"You need to double the backslashes to enter them in an R string. So
gsub("N\\A", "NA", original, fixed=TRUE)
should work if original contains a single backslash, and
gsub("N\\\\A", "NA", original, fixed=TRUE)
should work if it contains a double one. Two things add to the confusion
here: First, a single backslash will be displayed doubled by print(). .. "
------
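As a minimal sketch of what Duncan describes (the names one_slash and two_slash are mine, not from the thread, and I am assuming the data really contain literal backslashes):

```r
# Hypothetical inputs: the stored strings N\A and N\\A.
one_slash <- "N\\A"      # 3 characters: N, \, A
two_slash <- "N\\\\A"    # 4 characters: N, \, \, A

# fixed = TRUE makes gsub() treat the pattern as a literal string,
# not as a regular expression, so only R's own string escaping applies.
fixed_one <- gsub("N\\A", "NA", one_slash, fixed = TRUE)
fixed_two <- gsub("N\\\\A", "NA", two_slash, fixed = TRUE)

print(nchar(one_slash))  # 3
print(nchar(two_slash))  # 4
print(fixed_one)         # [1] "NA"
print(fixed_two)         # [1] "NA"
```

Without fixed = TRUE the pattern is also interpreted by the regular expression engine, which adds a second layer of escaping on top of the one shown here.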
Well, let's see: (On R version 2.11.1, 2010-5-31 for Windows)
> astring <- "n\a"
> print(astring)
[1] "n\a"
So at first glance Duncan's last sentence appears to be incorrect: the "\"
is not displayed doubled. However ...
> bstring <- "N\A"
Error: '\A' is an unrecognized escape in character string starting "N\A"
What's going on? Well, the "\a" in astring is a _single_ escape sequence (for
a beep/bell sound, on Windows anyway: cat("\a") should make a sound), so
astring contains no backslash character at all, and print() correctly shows
"\a" undoubled. However, since the "\A" in bstring does _not_ correspond to
any escape sequence, the string literal "N\A" cannot be parsed and an error
is thrown. But:
> bstring <- "N\\A"
> print(bstring)
[1] "N\\A" ## is fine: one stored backslash, displayed doubled by print()
## ... Noting that
> nchar("\\A")
[1] 2
So whether a "\" needs to be doubled or not depends on whether the parser
can interpret it as part of a legitimate escape sequence, whence
gsub("\a","","\a") ## works but
gsub("\A","","\A") ## does not.
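One way to check what the parser actually stored is to look at character counts and raw bytes. This is only a sketch; the variable names are mine, and parse(text = ...) is used here merely so the unrecognized escape can be observed without aborting the script:

```r
# "\a" is one character (BEL, byte 0x07); no backslash is stored.
bell <- "n\a"
print(nchar(bell))       # 2
print(charToRaw(bell))   # 6e 07

# "\\A" is a backslash followed by A; print() shows that backslash doubled.
slash <- "N\\A"
print(nchar(slash))      # 3
print(charToRaw(slash))  # 4e 5c 41

# "\A" is not a valid escape, so the parser rejects the literal.
# src holds the five source characters: " N \ A "
src <- "\"N\\A\""
res <- tryCatch(parse(text = src), error = function(e) e)
print(inherits(res, "error"))  # TRUE
```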
To avoid such confusion, I think Duncan's advice to double backslashes
should be heeded as much as possible. Unfortunately, I don't think it's
always possible: when you actually want an escape such as a newline, the
backslash has to stay single:
> newlineString <- "first line\nsecond line\n"
> print(newlineString)
[1] "first line\nsecond line\n"
> cat(newlineString)
first line
second line
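To make the contrast concrete, here is a small sketch (names are mine) comparing the string above with what you would get by doubling the backslashes, assuming the goal is an actual line break:

```r
real_newline <- "first line\nsecond line\n"    # \n is one character (LF)
literal_bs_n <- "first line\\nsecond line\\n"  # a backslash followed by n

print(nchar(real_newline))  # 23
print(nchar(literal_bs_n))  # 25

# Only the first string contains a line feed character.
has_lf_real    <- grepl("\n", real_newline, fixed = TRUE)
has_lf_literal <- grepl("\n", literal_bs_n, fixed = TRUE)
print(has_lf_real)     # TRUE
print(has_lf_literal)  # FALSE
```

With the doubled version, cat() would write the backslash and the n out literally instead of breaking the line.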
Cheers,
Bert
Bert Gunter
Genentech Nonclinical Statistics
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Uwe Ligges
> Sent: Tuesday, June 29, 2010 4:11 AM
> To: Jason Rupert
> Cc: r-help at r-project.org
> Subject: Re: [R] gsub issue in R 2.11.1, but not present in 2.9.2
>
>
>
> On 29.06.2010 12:47, Jason Rupert wrote:
> > Previously in R 2.9.2 I used the following to convert from an improperly
> formatted NA string into one that is a bit more consistent.
> >
> >
> > gsub("N\A", "NA", "N\A", fixed=TRUE)
> >
> > This worked in R 2.9.2, but now in R 2.11.1 it doesn't seem to work and
> throws the following error.
> > Error: '\A' is an unrecognized escape in character string starting "N\A"
> >
> > I guess my questions are the following:
> > (1) Is this expected behavior?
> > (2) If it is expected behavior, what is the proper way to replace "N\A"
> with "NA" and "N\\A" with "NA"?
>
>
> If your original text "thestring" contains "N\A", then the R
> representation is "N\\A", and hence
>
> gsub("N\\A", "NA", thestring)
>
> If you want to try explicitly, you need to write
>
> gsub("N\\A", "NA", "N\\A")
>
> If your original text contains two backslashes, both have to be escaped as
> in
>
> gsub("N\\\\A", "NA", thestring)
>
> Uwe Ligges
>
>
> > Thank you again for all the help and insight.
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.