[R] Regular expression help
Marc Schwartz
marc_schwartz at me.com
Tue Dec 8 00:15:35 CET 2009
On Dec 7, 2009, at 5:04 PM, Ramya wrote:
>
> Hi there
>
> I have a string like this. I want to extract 9831019 from this string.
> I used the regular expression \d+, by which I can only get it to see 7
> and return.
> This type of number (9831019) can appear in any part of the string and is
> always more than 5 digits long, and I want to give that as a
> condition.
>
> UV7C11-F9-E1 MCS#9831019
> MCS Lot #9512516"
>
>
> How do I go about it?
>
> Ramya
Is the double quote actually part of your data or just a typo?
I am not sure that it will matter in the end, but here is one approach:
> x
[1] "UV7C11-F9-E1 MCS#9831019" "MCS Lot #9512516\""
Note that I have the double quote included in the second value, which
is escaped when printed here.
> gsub("^.*#([0-9]*).*$", "\\1", x)
[1] "9831019" "9512516"
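This uses gsub() with a back reference: the pattern matches the entire
string, and the replacement "\\1" keeps only the digits captured within
the parens in the regex.
Any characters from the beginning of the line to the '#' are dropped,
as are any characters after the numeric sequence to the end of the line.
Note that this relies on the number immediately following a '#', as it
does in both of your examples.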
See ?gsub for more information.
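If the number is not always preceded by a '#', a possible alternative is to match on the length condition you describe (more than 5 digits) instead. The following is only a sketch, assuming each string contains at most one such run of digits; the names m and ids are just illustrative:

```r
# Sample data as printed above; the second value keeps its trailing double quote.
x <- c("UV7C11-F9-E1 MCS#9831019", "MCS Lot #9512516\"")

# regexpr() finds the first run of 6 or more digits in each string,
# so shorter runs such as the "7" or "11" in "UV7C11" are skipped.
m <- regexpr("[0-9]{6,}", x)

# regmatches() returns the matched text for the elements that matched.
ids <- regmatches(x, m)
ids
```

Be aware that regmatches() silently drops elements with no match, so the result can be shorter than x if some strings have no run of 6 or more digits.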
HTH,
Marc Schwartz