[R] Removing a dollar sign from a character vector
William Dunlap
wdunlap at tibco.com
Thu Feb 11 18:52:34 CET 2016
I should have said that R-3.2.3 requires the $ to be backslashed even when it
is not at the end of the pattern:
> gsub("$[[:digit:]]*", "<money>", c("$VAR", "$20/oz."))
[1] "$VAR<money>" "$20/oz.<money>"
> gsub("\\$[[:digit:]]*", "<money>", c("$VAR", "$20/oz."))
[1] "<money>VAR" "<money>/oz."
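A minimal sketch of the same point, assuming the toy vector above: the bracket form and fixed=TRUE suggested further down the thread should behave like the backslash form wherever the dollar sign sits in the pattern. The variable names (x, escaped, bracketed, stripped) are only for illustration.

```r
# Toy input taken from the example above.
x <- c("$VAR", "$20/oz.")

# Backslash-escaped dollar: matches a literal "$" plus any following digits.
escaped <- gsub("\\$[[:digit:]]*", "<money>", x)

# Bracket expression: "$" is an ordinary character inside [ ].
bracketed <- gsub("[$][[:digit:]]*", "<money>", x)

# fixed = TRUE: the pattern is a plain string, not a regular expression,
# so this simply deletes every literal "$".
stripped <- gsub("$", "", x, fixed = TRUE)

print(escaped)    # "<money>VAR" "<money>/oz."
print(bracketed)  # "<money>VAR" "<money>/oz."
print(stripped)   # "VAR" "20/oz."
```

With fixed=TRUE the character classes are not available, so it only fits when the whole pattern is literal text.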
Modern Linuxen's tools like sed do not seem to have this requirement.
% echo '$VAR' '$20/oz.' | sed -e 's/$[0-9]*/<money>/g'
<money>VAR <money>/oz.
% echo '$VAR' '$20/oz.' | sed -e 's/\$[0-9]*/<money>/g'
<money>VAR <money>/oz.
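Coming back to the original question quoted below (getting numbers out of strings like "$1,000.00 "), here is a hedged sketch rather than a definitive recipe. It assumes every element looks like the toy vector in the original post; the names cleaned, amounts and no_cents are made up for the example. It also shows why ".00" surprised the poster: an unescaped dot matches any character, so ",00" and "600" were removed as well.

```r
# Toy vector copied from the original post (note the trailing spaces).
y <- c("$1,000.00 ", "$1,000.00 ", "$1,000.00 ", "$2,600.00 ", "$2,600.00 ")

# Remove every dollar sign, comma, and whitespace character.
# Inside a bracket expression "$" and "," are ordinary characters.
cleaned <- gsub("[$,[:space:]]", "", y)

# Convert the remaining text ("1000.00", "2600.00") to numbers.
amounts <- as.numeric(cleaned)
print(amounts)

# To drop only a literal ".00", treat the pattern as fixed text so that
# the dot is not read as "any character".
no_cents <- gsub(".00", "", y, fixed = TRUE)
print(no_cents[1])  # "$1,000 "
```

If the real column contains other currency formats or missing values, as.numeric() will return NA (with a warning) for anything it cannot parse, so check for that before relying on the result.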
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Thu, Feb 11, 2016 at 9:30 AM, William Dunlap <wdunlap at tibco.com> wrote:
> In certain programs (not current R), a pattern with stuff after a naked dollar
> sign would not match anything because dollar meant end-of-string.
>
> In any case I prefer simple rules like 'backslash a dollar sign' instead of
> 'backslash a dollar sign at the end of the pattern but not elsewhere'.
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Thu, Feb 11, 2016 at 9:01 AM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
> wrote:
>
>> The "end of string" special meaning only applies when the dollar sign is
>> at the right end of the pattern (as it was in the OP attempt). That is, it
>> is NOT generally necessary to wrap it in brackets to remove the special
>> meaning unless it would otherwise be at the end of the pattern string.
>> --
>> Sent from my phone. Please excuse my brevity.
>>
>> On February 10, 2016 10:10:40 PM PST, William Dunlap via R-help <
>> r-help at r-project.org> wrote:
>>
>>> > y
>>> [1] "$1,000.00 " "$1,000.00 " "$1,000.00 " "$2,600.00 " "$2,600.00 "
>>>
>>> > gsub("$", "", y)
>>> [1] "$1,000.00 " "$1,000.00 " "$1,000.00 " "$2,600.00 " "$2,600.00 " # no change. Why?
>>>
>>> "$" as a regular expression means "end of string", which has zero length -
>>> replacing "end of string" with nothing does not affect the string. Try
>>> gsub("$", "DOLLAR", "$100") to see it do something.
>>>
>>> Use either fixed=TRUE so the 'pattern' argument is not regarded as a
>>> regular expression or pattern="\\$" or pattern="[$]" to remove dollar's special
>>> meaning in the pattern language.
>>>
>>> Read up on regular expressions (probably there is a See Also entry in
>>> help(gsub)).
>>>
>>>
>>> Bill Dunlap
>>> TIBCO Software
>>> wdunlap tibco.com
>>>
>>> On Wed, Feb 10, 2016 at 9:39 PM, James Plante <jimplante at me.com> wrote:
>>>
>>>> What I’ve got:
>>>> # sessionInfo()
>>>> R version 3.2.3 (2015-12-10)
>>>> Platform: x86_64-apple-darwin13.4.0 (64-bit)
>>>> Running under: OS X 10.11.3 (El Capitan)
>>>>
>>>> locale:
>>>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>>>
>>>> attached base packages:
>>>> [1] stats graphics grDevices utils datasets methods base
>>>>
>>>> other attached packages:
>>>> [1] XML_3.98-1.3 dplyr_0.4.3
>>>>
>>>> loaded via a namespace (and not attached):
>>>> [1] magrittr_1.5 R6_2.1.2 assertthat_0.1 rsconnect_0.4.1.4
>>>> [5] parallel_3.2.3 DBI_0.3.1 tools_3.2.3 Rcpp_0.12.3
>>>>
>>>> > str(y) # toy vector, subset of larger vector in a dataframe of ~4,600 rows.
>>>>  chr [1:5] "$1,000.00 " "$1,000.00 " "$1,000.00 " "$2,600.00 " "$2,600.00 "
>>>>
>>>> y is a subset of a column in a dataframe that’s too big to post. I tried
>>>> the commands listed here on the dataframe and it didn’t work. So I’m using
>>>> a small subset to find out where my error is. It’s being a PITA, and I’m
>>>> trying to solve it. What I want is a vector of numbers: 1000, 1000, 1000,
>>>> 2600, 2600.
>>>>
>>>> What I’ve tried:
>>>>
>>>> > y
>>>> [1] "$1,000.00 " "$1,000.00 " "$1,000.00 " "$2,600.00 " "$2,600.00 "
>>>>
>>>> > gsub("$", "", y)
>>>> [1] "$1,000.00 " "$1,000.00 " "$1,000.00 " "$2,600.00 " "$2,600.00 " # no change. Why?
>>>>
>>>> > gsub(".00", "", y) # note: that’s dot zero zero, replace with ""
>>>> [1] "$10 " "$10 " "$10 " "$2, " "$2, " # WTF?
>>>>
>>>> I’ve also tried sapply and apply, but haven’t yet tried a loop. (These
>>>> were done in desperation; gsub ought to work the way the help says.) I’ve
>>>> tried lots more than is listed here, over and over, with no results. I’d be
>>>> grateful for any guidance you can provide.
>>>>
>>>> Thanks in advance,
>>>>
>>>> Jim Plante
>>>>
>>>> ------------------------------
>>>>
>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>> [[alternative HTML version deleted]]
>>>
>