[R] Regex question to find a string that contains 5-9 alpha-numeric characters, at least one of which is a number
Wacek Kusnierczyk
Waclaw.Marcin.Kusnierczyk at idi.ntnu.no
Tue Jun 9 20:12:20 CEST 2009
Tan, Richard wrote:
> Sorry I did not give some examples in my previous posting to make my
> question clear. It's not exactly 1 digit, but at least one digit. Here
> are some examples:
>
>> input = c(none='0foo f0oo foo0 foofoofoo0 0foofoofoo TOOLOOOO9NGG NONUMBER',
>>     all='foob0 fo0o0b 0foob 0foobardo foob4rdoo foobardo0')
>> gsub(x=input, replacement='x', perl=TRUE, pattern=something)
>
> none
> "0foo f0oo foo0 foo00 f0o0o foofoofoo0 0foofoofoo TOOLOOOO9NGG NONUMBER"
> all
> "x x x x x x"
>
ok, then to my simple mind the following should do:
input = c(
    none='0foo f0oo foo0 foofoofoo0 0foofoofoo TOOLOOOO9NGG NONUMBER',
    all='foob0 fo0o0b 0foob 0foobardo foob4rdoo foobardo0 123456789')
gsub('(?=[[:alpha:]]{0,8}[[:digit:]])\\b[[:alnum:]]{5,9}\\b', 'x',
    input, perl=TRUE)
# none -> '0foo f0oo foo0 foofoofoo0 0foofoofoo TOOLOOOO9NGG NONUMBER'
# all  -> 'x x x x x x x'
where the regex reads: 'if there is a digit ahead of you, preceded by at
most 8 letters, then match 5 to 9 alphanumerics (digits and/or letters)
delimited by word boundaries'.
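As a rough cross-check of the lookahead pattern, one could extract the matches instead of replacing them, and compare against a plain two-step test on each space-separated token. This is only a sketch: check_token is a hypothetical helper name, not part of the thread, and it assumes tokens are separated by single spaces as in the sample input.

```r
input <- c(
    none = '0foo f0oo foo0 foofoofoo0 0foofoofoo TOOLOOOO9NGG NONUMBER',
    all  = 'foob0 fo0o0b 0foob 0foobardo foob4rdoo foobardo0 123456789')

pattern <- '(?=[[:alpha:]]{0,8}[[:digit:]])\\b[[:alnum:]]{5,9}\\b'

# Pull out the substrings the lookahead regex matches, one list element per input string.
matched <- regmatches(input, gregexpr(pattern, input, perl = TRUE))

# Hypothetical helper: test the two conditions separately on a whole token,
# i.e. 5 to 9 alphanumerics in total, and at least one digit somewhere.
check_token <- function(tok) {
    grepl('^[[:alnum:]]{5,9}$', tok) & grepl('[[:digit:]]', tok)
}

# Assumes single-space separators, as in the sample input.
tokens <- strsplit(input, ' ', fixed = TRUE)
expected <- lapply(tokens, function(t) t[check_token(t)])

# TRUE when both approaches select the same tokens.
identical(unname(matched), unname(expected))
```

The two-step version is easier to read but needs the input split into tokens first; the single lookahead regex works directly inside gsub.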
vQ