[R] Regex question to find a string that contains 5-9 alpha-numeric characters, at least one of which is a number
Wacek Kusnierczyk
Waclaw.Marcin.Kusnierczyk at idi.ntnu.no
Tue Jun 9 09:53:57 CEST 2009
Wacek Kusnierczyk wrote:
> Marc Schwartz wrote:
>
>> On Jun 8, 2009, at 5:27 PM, Barry Rowlingson wrote:
>>
>>
>>> On Mon, Jun 8, 2009 at 10:40 PM, Tan, Richard<RTan at panagora.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> This is not exactly an R question but I am trying to use gsub to
>>>> replace a string that contains 5-9 alpha-numeric characters, at
>>>> least one of which is a number. Is there a good way to write it in
>>>> a one line regex?
>>>>
>>> The only way I can think of is to spell out all the possible
>>> expressions, something like:
>>>
>>> [0-9][a-z0-9]{4} | [a-z0-9][0-9][a-z0-9]{3} |
>>> [a-z0-9]{2}[0-9][a-z0-9]{2} .... and so on. That is, have a regex
>>> component for every possible 5, 6, 7, 8, and 9 character expression
>>> with [0-9] in each place. I'm not sure this qualifies as 'good',
>>> though..
>>>
>
> something like this?
>
> input = c(
> none='0foo f0oo foo0 foo00 f0o0o foofoofoo0 0foofoofoo',
> all='foob0 foo0b 0foob 0foobardo foob4rdoo foobardo0')
>
> gsub(x=input, replacement='x', perl=TRUE,
> pattern=paste(collapse='|',
>
> sprintf('\\b[[:alpha:]-]{%d}[[:digit:]][[:alpha:]]{%d,%d}\\b',
Of course, the first character class should not have contained the
minus; the format string should have been:
'\\b[[:alpha:]]{%d}[[:digit:]][[:alpha:]]{%d,%d}\\b'
vQ
> 0:8,
> c(4:0, rep(0,4)), 8:0)))
> # none -> '0foo f0oo foo0 foo00 f0o0o foofoofoo0 0foofoofoo'
> # all -> 'x x x x x x'
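
For reference, the corrected call can be assembled into one runnable
piece, as sketched below. Note that the generated alternatives match
words with exactly one digit, which is why 'foo00' and 'f0o0o' stay in
the 'none' group. The second pattern is not from this thread: it is an
alternative sketch using a PCRE lookahead (so it assumes perl=TRUE) for
the "at least one digit" reading of the question. The variable names
pattern_one, pattern_any, res_one and res_any are made up for
illustration.

```r
input <- c(
  none = '0foo f0oo foo0 foo00 f0o0o foofoofoo0 0foofoofoo',
  all  = 'foob0 foo0b 0foob 0foobardo foob4rdoo foobardo0')

# Corrected pattern from the message: nine alternatives, one per digit
# position (0 to 8 letters before the digit), total length 5 to 9.
pattern_one <- paste(collapse = '|',
  sprintf('\\b[[:alpha:]]{%d}[[:digit:]][[:alpha:]]{%d,%d}\\b',
          0:8, c(4:0, rep(0, 4)), 8:0))
res_one <- gsub(x = input, replacement = 'x', perl = TRUE,
                pattern = pattern_one)
# none -> '0foo f0oo foo0 foo00 f0o0o foofoofoo0 0foofoofoo'
# all  -> 'x x x x x x'

# Alternative (an assumption, not the author's method): a lookahead
# requires a digit somewhere in the word, then 5 to 9 alphanumerics.
pattern_any <- '\\b(?=[[:alnum:]]*[[:digit:]])[[:alnum:]]{5,9}\\b'
res_any <- gsub(x = input, replacement = 'x', perl = TRUE,
                pattern = pattern_any)
# none -> '0foo f0oo foo0 x x foofoofoo0 0foofoofoo'
# all  -> 'x x x x x x'
```

The lookahead variant also replaces 'foo00' and 'f0o0o', since both are
five alphanumeric characters containing at least one digit; which
behaviour is wanted depends on how the original question is read.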