[R] R newbie: how to replace string/regular expression

Charles C. Berry cberry at tajo.ucsd.edu
Sun Nov 2 22:56:23 CET 2008




Gabor,

Why not just this:

 	expos <- list( B="e9", M="e6", m="e6", k="e3" )
 	as.numeric( gsubfn("[[:alpha:]]", expos, d ) )
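
For anyone without gsubfn installed, here is a rough base-R sketch of the same idea. It assumes the d from Krishna's post and replaces the gsubfn call with a plain loop over sub(); the loop is only a stand-in for illustration and says nothing about how gsubfn works internally. The point is that each suffix letter becomes an exponent string, which as.numeric() can then parse.

```r
# Data from the original post
d <- c("120.0M", "11.01m", "209.7B", "100.00k", "50")

# Same suffix-to-exponent table as above
expos <- list(B = "e9", M = "e6", m = "e6", k = "e3")

s <- d
for (suffix in names(expos)) {
  # fixed = TRUE treats the suffix as a literal character, not a regex
  s <- sub(suffix, expos[[suffix]], s, fixed = TRUE)
}
# s is now c("120.0e6", "11.01e6", "209.7e9", "100.00e3", "50")

# Strings such as "120.0e6" are valid numeric literals for as.numeric()
result <- as.numeric(s)
```

This only works because none of the replacement strings contains a letter that appears as a name in expos; otherwise a later pass of the loop could rewrite an earlier replacement.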

HTH,

Chuck

p.s. I am not sure why B goes with e6 or K with e-02 (below), but 
Krishna can adjust the values accordingly.
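
If the multipliers need adjusting, one possible way to keep them in one place is a named numeric vector and a vectorised multiply in base R. This is only a sketch: the multiplier values and the names mult, scale_by and scaled are illustrative assumptions, not something from the thread, and should be changed to whatever scale is actually wanted.

```r
# Data from the original post
d <- c("120.0M", "11.01m", "209.7B", "100.00k", "50")

# Hypothetical multipliers; adjust these to the scale you actually want
mult <- c(B = 1e9, M = 1e6, m = 1e6, k = 1e3)

# Last character of each string, and whether it is a letter
last <- substring(d, nchar(d))
is_suffix <- grepl("[[:alpha:]]", last)

# Numeric part: drop one trailing letter if there is one
value <- as.numeric(sub("[[:alpha:]]$", "", d))

# Multiplier per element; stays 1 where there is no suffix.
# A suffix missing from mult would give NA here.
scale_by <- rep(1, length(d))
scale_by[is_suffix] <- mult[last[is_suffix]]

scaled <- value * scale_by
```

Everything here is vectorised, so it should not need an explicit loop over the elements of a large data set.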


On Sun, 2 Nov 2008, Gabor Grothendieck wrote:

> There was an error in your regexp which I did not correct. Here it is
> again corrected to better illustrate the solution:
>
>> gsubfn("(.*)B", ~ as.numeric(x) * 10e6, d, ignore.case = TRUE)
> [1] "120.0M"    "11.01m"    "2.097e+09" "100.00k"   "50"
>
> On Sun, Nov 2, 2008 at 7:55 AM, Gabor Grothendieck
> <ggrothendieck at gmail.com> wrote:
>> Your gsub example is almost exactly what gsubfn in the gsubfn package
>> does.  gsubfn is like gsub except that the replacement can be a function:
>>
>>> library(gsubfn)
>>> gsubfn("(.*)B$", ~ as.numeric(x) * 10e6, d, ignore.case = TRUE)
>> [1] "120.0M"    "11.01m"    "2.097e+09" "100.00k"   "50"
>>
>> Also there are examples very similar to this:
>>
>> 1. at the end of section 2 of
>> vignette("gsubfn")
>>
>> 2. in
>> demo("gsubfn-si")
>>
>> Also see the gsubfn home page:
>> http://gsubfn.googlecode.com
>>
>> Also note that if you want to return the values rather than
>> transform and reinsert them then strapply in the same package
>> can do that.
>>
>> On Sun, Nov 2, 2008 at 3:43 AM, Krishna Dagli/Krushna Dagli
>> <krishna.dagli at gmail.com> wrote:
>>> Hello;
>>>
>>> I am an R newbie and would like to know the correct and efficient method
>>> for doing string replacement.
>>>
>>> I have a large data set in which I want to replace the characters "M",
>>> "b", and "K" (currency in millions, billions and thousands) and convert
>>> the values to millions.  That is, replace 209.7B with (209.7 * 10e6),
>>> 100.00K with (100.00 * 1/100), and so on.
>>>
>>> d <- c("120.0M", "11.01m", "209.7B", "100.00k", "50")
>>>
>>> This works, that is, it removes the "b/B":
>>>
>>> gsub ("(.*)(B$)", "\\1", d, ignore.case=T, perl=T)
>>>
>>> but
>>>
>>> gsub ("(.*)(B$)", as.numeric("\\1") * 10e6, d, ignore.case=T, perl=T)
>>>
>>> does not work. I tried sprintf and other combinations with as.numeric, but
>>> they all fail. How can I use \\1 and multiply it by 10e6?
>>>
>>> The other solution is :
>>>
>>> location <- grep ("M", d, ignore.case=T)
>>> y <- sub("M", "", d, ignore.case=T)
>>> y[location]<-y[location] * 10e6
>>>
>>> Is the second solution, the combination of grep with multiplication
>>> (if it works), faster? Or what is the most efficient method
>>> to do something like this in R?
>>>
>>> Thanks and Regards
>>> Krishna
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>

Charles C. Berry                            (858) 534-2098
                                             Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu	            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901


