[R] regular expression question

Wacek Kusnierczyk Waclaw.Marcin.Kusnierczyk at idi.ntnu.no
Wed Mar 4 18:49:16 CET 2009


Greg Snow wrote:
> Here is another approach that still uses strsplit if you want to stay with that:
>
>   
>> tmp <- '(-0.791,-0.263].(-38,-1.24].(0.96,2.43]'
>> strsplit(tmp, '\\.(?=\\()', perl=TRUE)
>>     
> [[1]]
> [1] "(-0.791,-0.263]" "(-38,-1.24]"     "(0.96,2.43]"   
>
> This uses the Perl 'look-ahead' indicator to say only match on a period that is followed by a '(', but don't include the '(' in the match.
>   

right;  you could extend this pattern to split the string at every dot
that has no digit immediately before or after it, for example:
   
    strsplit(tmp, '(?<!\\d)\\.(?!\\d)', perl=TRUE)

of course, this fails if there are numbers without a leading zero, e.g., .11
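
a minimal sketch of the pattern above, run on the example string from this
thread.  the second input (tmp2) is a hypothetical one, not from the thread,
made up to illustrate one way a number without a leading zero can defeat the
pattern:  if '0.5' and '.11' are joined by a dot, the separating dot has a
digit before it and the dot of '.11' has a digit after it, so neither is
treated as a split point.

```r
# the example string from the thread
tmp <- '(-0.791,-0.263].(-38,-1.24].(0.96,2.43]'

# split at a dot only if there is no digit directly before it (lookbehind)
# and no digit directly after it (lookahead)
pattern <- '(?<!\\d)\\.(?!\\d)'
parts <- strsplit(tmp, pattern, perl=TRUE)[[1]]
print(parts)
# "(-0.791,-0.263]" "(-38,-1.24]" "(0.96,2.43]"

# hypothetical input: '0.5' and '.11' joined by a dot
tmp2 <- '0.5..11'
parts2 <- strsplit(tmp2, pattern, perl=TRUE)[[1]]
print(parts2)
# comes back as one piece, "0.5..11", i.e., no split happens
```

with the bracketed intervals above the problem does not show up, since every
separating dot sits between ']' and '('.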

vQ



