[R] regular expression strikes again
peter dalgaard
pdalgd at gmail.com
Tue Jul 9 13:50:14 CEST 2013
On Jul 9, 2013, at 12:19 , PIKAL Petr wrote:
> Thanks, it works to some extent.
>
> The test data come from a file that was not filled in properly. With your suggestion I get correct values for the 2-digit numbers before the ",", but the entries that have no space before the number are returned unchanged.
>
>> dput(test[c(1:10,500:510)])
> c("Cl Tio2 ph 5,8 1", "Cl Tio2 ph 5,8 2", "Cl Tio2 ph 5,8 3",
> "pH5,57 1", "pH5,57 2", "pH5,57 3", "pH4,8 1", "pH4,8 2", "pH4,8 3",
> "pH4,12 1", "pH 9,36 2", "pH 9,36 3", "pH 9,66 1", "pH 9,66 2",
> "pH 9,66 3", "pH 10,04 1", "pH 10,04 2", "pH 10,04 3", "RGLP 144006 pH 6,13 1",
> "RGLP 144006 pH 6,13 2", "RGLP 144006 pH 6,13 3")
>
>> gsub("^.* ([[:digit:]]+,[[:digit:]]*).*$", "\\1", test[c(1:10,500:510)])
> [1] "5,8" "5,8" "5,8" "pH5,57 1" "pH5,57 2" "pH5,57 3"
> [7] "pH4,8 1" "pH4,8 2" "pH4,8 3" "pH4,12 1" "9,36" "9,36"
> [13] "9,66" "9,66" "9,66" "10,04" "10,04" "10,04"
> [19] "6,13" "6,13" "6,13"
>>
>
> Basically, I would like to get the one or two digits before the comma and the two digits after the comma.
Then maybe require a non-digit, rather than a space, in front of the number:
> gsub("^.*[^[:digit:]]([[:digit:]]+,[[:digit:]]*).*$", "\\1", x)
[1] "5,8" "5,8" "5,8" "5,57" "5,57" "5,57" "4,8" "4,8" "4,8"
[10] "4,12" "9,36" "9,36" "9,66" "9,66" "9,66" "10,04" "10,04" "10,04"
[19] "6,13" "6,13" "6,13"
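
A minimal sketch of why the non-digit guard seems to be needed, using a few strings copied from the dput() output above. The variable names and the regmatches() alternative are illustrative assumptions, not something taken from the thread.

```r
# A few of the strings from the quoted dput() output.
x <- c("Cl Tio2 ph 5,8 1", "pH5,57 1", "pH4,12 1", "pH 10,04 2",
       "RGLP 144006 pH 6,13 3")

# Pattern from the reply: the captured number must be preceded by a non-digit.
res_reply <- gsub("^.*[^[:digit:]]([[:digit:]]+,[[:digit:]]*).*$", "\\1", x)
# "5,8" "5,57" "4,12" "10,04" "6,13"

# Simply dropping the space (and the guard) lets the greedy ".*" swallow
# the leading "1" of "10,04", so only "0,04" is captured.
res_naive <- gsub("^.*([[:digit:]]+,[[:digit:]]*).*$", "\\1", "pH 10,04 2")
# "0,04"

# Alternative (an assumption, not from the thread): extract the match
# directly instead of replacing everything around it. Note that regmatches()
# silently drops elements without a match, whereas gsub() returns them
# unchanged.
res_match <- regmatches(x, regexpr("[[:digit:]]+,[[:digit:]]+", x))
```

One caveat with the guarded pattern: it needs at least one non-digit character before the number, so a string that starts directly with the number (for example "5,8 1") would not match and would come back unchanged.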
--
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com