Re-using Gabor's suggestion from yesterday, I think the regex incantation
gsub("[^[:digit:].]+"," ",x)
will also do, where x is your vector of strings. It replaces each run of
characters that are neither digits nor "." with a single space.
> x
[1] "this Item costs 3.32 Dollars or maybe 10.00 cents"
> gsub("[^[:digit:].]+"," ",x)
[1] " 3.32 10.00 "
You can then "pipe" this through a textConnection to convert it to numeric:
> scan(textConnection(gsub("[^[:digit:].]+"," ",x)))
Read 2 items
[1] 3.32 10.00
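
For reuse, the two steps above could be wrapped in a small function. This is
only a sketch: extract_numbers() is a made-up name, not something in base R,
and it closes the textConnection explicitly so R does not warn about an
unused connection later. One caveat to keep in mind: because the pattern
keeps every ".", a sentence-ending period right after a number (e.g. "3.32.")
survives the gsub and scan() will then refuse to parse it.

```r
# Hypothetical helper (not part of base R): strip everything except
# digits and ".", then parse what is left as numbers.
extract_numbers <- function(s) {
  # Replace runs of anything that is not a digit or "." with one space.
  cleaned <- gsub("[^[:digit:].]+", " ", s)
  # Open a connection on the cleaned text and make sure it gets closed.
  con <- textConnection(cleaned)
  on.exit(close(con))
  # quiet = TRUE suppresses the "Read n items" message.
  scan(con, quiet = TRUE)
}

x <- "this Item costs 3.32 Dollars or maybe 10.00 cents"
extract_numbers(x)
```

On the example string this should give the same two values as the scan()
call above, 3.32 and 10.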
Hi,
On Aug 26, 2009, at 6:38 PM, Martin Batholdy wrote:
> hi,
>
> is there an easy way to extract numbers from a string?
>
> for example I have;
> "this Item costs 3.32 Dollars"
>
> is there an easy way to extract the 3.32 as a number?
Regular expressions to the rescue?
Perhaps you'll need to fine tune it, but see here:
R> gregexpr("(\\d+(\\.\\d+)?)", "this Item costs 3.32 Dollars", perl=T)
[[1]]
[1] 17
attr(,"match.length")
[1] 4
R> gregexpr("(\\d+(\\.\\d+)?)", "this Item costs 3.32 Dollars, that item costs 10.12 dollars", perl=T)
[[1]]
[1] 17 47
attr(,"match.length")
[1] 4 5
R> gregexpr("(\\d+(\\.\\d+)?)", "this Item costs 3.32 Dollars, that item costs 10 dollars even, ", perl=T)
[[1]]
[1] 17 47
attr(,"match.length")
[1] 4 2
R> gregexpr("(\\d+(\\.\\d+)?)", "this one is free ", perl=T)
[[1]]
[1] -1
attr(,"match.length")
[1] -1
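
gregexpr() only reports where the matches start and how long they are. To
turn those positions into actual numbers, one option is regmatches(), which
exists in current versions of base R (it may not be available in an older
installation, so treat that as an assumption). A minimal sketch, with x and
nums as illustrative names:

```r
# Two of the test strings from above; the second has no number in it.
x <- c("this Item costs 3.32 Dollars, that item costs 10 dollars even, ",
       "this one is free ")

# Same pattern as above: digits, optionally followed by "." and more digits.
m <- gregexpr("\\d+(\\.\\d+)?", x, perl = TRUE)

# regmatches() returns one character vector of matches per input string;
# as.numeric() then converts each vector.
nums <- lapply(regmatches(x, m), as.numeric)

nums[[1]]  # 3.32 10
nums[[2]]  # numeric(0), since nothing matched
```

Unlike the gsub() approach, this pattern does not pick up a stray "." that is
not followed by digits, but it still needs fine tuning for signs, exponents
or thousands separators.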
