[R] Removing a dollar sign from a character vector

William Dunlap wdunlap at tibco.com
Thu Feb 11 18:52:34 CET 2016


I should have said that R-3.2.3 requires the $ to be backslashed even when it
is not at the end of the pattern:

  > gsub("$[[:digit:]]*", "<money>", c("$VAR", "$20/oz."))
  [1] "$VAR<money>"    "$20/oz.<money>"
  > gsub("\\$[[:digit:]]*", "<money>", c("$VAR", "$20/oz."))
  [1] "<money>VAR"  "<money>/oz."
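
The same literal match can be written in more than one way. What follows is a minimal sketch that reuses the input vector from the example above; the variable names (x, by_backslash, by_bracket, no_dollar) are illustrative only. Note that fixed = TRUE disables regular-expression syntax entirely, so that variant can only strip the bare dollar sign, not the digits after it.

```r
# Input vector from the example above.
x <- c("$VAR", "$20/oz.")

# Escape the dollar sign with a backslash (doubled inside an R string).
by_backslash <- gsub("\\$[[:digit:]]*", "<money>", x)

# Put the dollar sign in a bracket expression, where it is literal.
by_bracket <- gsub("[$][[:digit:]]*", "<money>", x)

# Both spellings should give the same result.
stopifnot(identical(by_backslash, by_bracket))
print(by_backslash)

# fixed = TRUE treats the pattern as a plain string, so "$" needs no escaping,
# but character classes such as [[:digit:]] are no longer available.
no_dollar <- gsub("$", "", x, fixed = TRUE)
print(no_dollar)
```

Either escaped form works wherever the dollar sign sits in the pattern, which is the simple rule argued for elsewhere in this thread.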

Tools on modern Linux systems, such as sed, do not seem to have this requirement:
  % echo '$VAR' '$20/oz.' | sed -e 's/$[0-9]*/<money>/g'
  <money>VAR <money>/oz.
  % echo '$VAR' '$20/oz.' | sed -e 's/\$[0-9]*/<money>/g'
  <money>VAR <money>/oz.




Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, Feb 11, 2016 at 9:30 AM, William Dunlap <wdunlap at tibco.com> wrote:

> In certain programs (not current R), a pattern with stuff after a naked dollar
> sign would not match anything because dollar meant end-of-string.
>
> In any case I prefer simple rules like 'backslash a dollar sign' instead of
> 'backslash a dollar sign at the end of the pattern but not elsewhere'.
>
>
>
> On Thu, Feb 11, 2016 at 9:01 AM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
> wrote:
>
>> The "end of string" special meaning only applies when the dollar sign is
>> at the right end of the string (as it was in the OP attempt). That is, it
>> is NOT generally necessary to wrap it in brackets to remove the special
>> meaning unless it would otherwise be at the end of the pattern string.
>>
>> On February 10, 2016 10:10:40 PM PST, William Dunlap via R-help <
>> r-help at r-project.org> wrote:
>>
>>>> > y
>>>> [1] "$1,000.00 " "$1,000.00 " "$1,000.00 " "$2,600.00 " "$2,600.00 "
>>>> > gsub("$", "", y)
>>>> [1] "$1,000.00 " "$1,000.00 " "$1,000.00 " "$2,600.00 " "$2,600.00 " # no change. Why?
>>>
>>> "$" as a regular expression means "end of string", which has zero length, so
>>> replacing "end of string" with nothing does not affect the string.  Try
>>> gsub("$", "DOLLAR", "$100") to see it do something.
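
A small sketch of that experiment, to make the zero-length match visible. The variable names (appended, unchanged) are illustrative and not from the thread.

```r
# "$" matches the empty position at the end of the string, so the
# replacement text is appended and the literal dollar sign is untouched.
appended <- gsub("$", "DOLLAR", "$100")
print(appended)  # "$100DOLLAR"

# Replacing that same empty position with "" therefore changes nothing.
unchanged <- gsub("$", "", "$100")
stopifnot(identical(unchanged, "$100"))
```
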
>>>
>>> Use either fixed=TRUE, so the 'pattern' argument is not regarded as a regular
>>> expression, or pattern="\\$" or pattern="[$]" to remove the dollar sign's
>>> special meaning in the pattern language.
>>>
>>> Read up on regular expressions: help(regex) describes the pattern syntax
>>> and is linked from the See Also section of help(gsub).
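
Applied to the vector from the original question, that advice might look like the sketch below. The vector y is retyped from the post; stripping commas and whitespace and then calling as.numeric() are assumptions about what the poster wants, not something spelled out in this reply. The last step illustrates that "." is a metacharacter too (it matches any single character), which is one plausible reading of the surprising ".00" result in the question.

```r
# Toy vector retyped from the original question.
y <- c("$1,000.00 ", "$1,000.00 ", "$1,000.00 ", "$2,600.00 ", "$2,600.00 ")

# Inside a bracket expression "$" is literal; drop commas and whitespace too.
cleaned <- gsub("[$,[:space:]]", "", y)
amounts <- as.numeric(cleaned)
print(amounts)

# An unescaped "." matches any character, so ".00" also eats ",00" and "600".
# Escaping the dot (or using fixed = TRUE) removes only the literal ".00".
trimmed <- gsub("\\.00", "", y)
print(trimmed)
```
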
>>>
>>>
>>>
>>> On Wed, Feb 10, 2016 at 9:39 PM, James Plante <jimplante at me.com> wrote:
>>>
>>>>  What I’ve got:
>>>>  # sessionInfo()
>>>>  R version 3.2.3 (2015-12-10)
>>>>  Platform: x86_64-apple-darwin13.4.0 (64-bit)
>>>>  Running under: OS X 10.11.3 (El Capitan)
>>>>
>>>>  locale:
>>>>  [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>>>
>>>>  attached base packages:
>>>>  [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>
>>>>  other attached packages:
>>>>  [1] XML_3.98-1.3 dplyr_0.4.3
>>>>
>>>>  loaded via a namespace (and not attached):
>>>>  [1] magrittr_1.5      R6_2.1.2          assertthat_0.1    rsconnect_0.4.1.4
>>>>  [5] parallel_3.2.3    DBI_0.3.1         tools_3.2.3       Rcpp_0.12.3
>>>>
>>>>  > str(y) # toy vector, subset of larger vector in a dataframe of ~4,600 rows.
>>>>   chr [1:5] "$1,000.00 " "$1,000.00 " "$1,000.00 " "$2,600.00 " "$2,600.00 "
>>>>
>>>>  y is a subset of a column in a dataframe that’s too big to post. I tried
>>>>  the commands listed here on the dataframe and it didn’t work. So I’m using
>>>>  a small subset to find out where my error is. It’s being a PITA, and I’m
>>>>  trying to solve it. What I want is a vector of numbers: 1000, 1000, 1000,
>>>>  2600, 2600.
>>>>
>>>>  What I’ve tried:
>>>>
>>>>  > y
>>>>  [1] "$1,000.00 " "$1,000.00 " "$1,000.00 " "$2,600.00 " "$2,600.00 "
>>>>
>>>>  > gsub("$", "", y)
>>>>  [1] "$1,000.00 " "$1,000.00 " "$1,000.00 " "$2,600.00 " "$2,600.00 " # no change. Why?
>>>>
>>>>  > gsub(".00", "", y)  # note: that’s dot zero zero, replace with ""
>>>>  [1] "$10 " "$10 " "$10 " "$2, " "$2, "  #WTF?
>>>>
>>>>  I’ve also tried sapply and apply, but haven’t yet tried a loop. (These
>>>>  were done in desperation; gsub ought to work the way the help says.) I’ve
>>>>  tried lots more than is listed here, over and over, with no results. I’d be
>>>>  grateful for any guidance you can provide.
>>>>
>>>>  Thanks in advance,
>>>>
>>>>  Jim Plante
>>>>
>>>> ------------------------------
>>>>
>>>>  R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>  https://stat.ethz.ch/mailman/listinfo/r-help
>>>>  PLEASE do read the posting guide
>>>>  http://www.R-project.org/posting-guide.html
>>>>  and provide commented, minimal, self-contained, reproducible code.
>>>>