[R] String manipulation

Gabor Grothendieck ggrothendieck at gmail.com
Mon Feb 14 00:00:14 CET 2011


On Sun, Feb 13, 2011 at 4:42 PM, Megh Dal <megh700004 at gmail.com> wrote:
> Hi Gabor, thanks (and Jim as well) for your suggestion. However, this does
> not work properly for the following string:
>
>> MyString <- "ABCFR34564IJVEOJC3434.36453"
>> strapply(MyString, "(\\D+)(\\d+)(\\D+)(\\d+)", c)[[1]]
> [1] "ABCFR"   "34564"   "IJVEOJC" "3434"
>
> The 4th group is a decimal number, but the part after the decimal point
> is not captured.
>
> The same kind of unintended result occurs here as well:
>
>> MyString <- "ABCFR34564.354IJVEOJC3434.36453"
>> strapply(MyString, "(\\D+)(\\d+)(\\D+)(\\d+)", c)[[1]]
> [1] "ABCFR"   "34564"   "."       "354"     "IJVEOJC" "3434"    "."       "36453"
> Can you please tell me how I can modify that?
>

In that case we need to tell it that a number can include a dot.
Additionally, the following simplifies the regular expression by
assuming any number of non-numeric fields, each followed by a numeric field:

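library(gsubfn)  # provides strapply()
# Fields as character strings:
strapply(MyString, "(\\D+)([.0-9]+)", c)[[1]]
# Same fields, with the numeric ones converted by as.numeric():
strapply(MyString, "(\\D+)([.0-9]+)", ~ list(s1, as.numeric(s2)))[[1]]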
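
If gsubfn is not available, roughly the same split can be sketched in base R. This is only an illustration, not the approach above: split_fields is a hypothetical helper name, and the pattern assumes a number is a run of digits with at most one decimal point (slightly stricter than [.0-9]+, which would also accept something like "1.2.3").

```r
# Base R sketch (no gsubfn needed). split_fields() is a hypothetical helper.
split_fields <- function(x) {
  # First alternative: digits with an optional fractional part.
  # Second alternative: a run of non-digit characters.
  pattern <- "[0-9]+(\\.[0-9]+)?|[^0-9]+"
  regmatches(x, gregexpr(pattern, x))[[1]]
}

fields <- split_fields("ABCFR34564.354IJVEOJC3434.36453")
fields
# [1] "ABCFR"      "34564.354"  "IJVEOJC"    "3434.36453"

# Convert only the fields that start with a digit.
is_num <- grepl("^[0-9]", fields)
nums <- as.numeric(fields[is_num])
```

Here nums holds the two numeric fields (34564.354 and 3434.36453) as doubles, while fields[!is_num] keeps the non-numeric ones as strings.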


-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


