[R] regular expression help
Enrico Schumann
es at enricoschumann.net
Thu Jun 8 13:41:24 CEST 2017
Quoting Ashim Kapoor <ashimkapoor at gmail.com>:
> Dear All,
>
> My query is:
>
> Do we always need to use perl = TRUE option when doing ignore.case=TRUE?
>
> A small example :
>
> my_text =
> "RECOVERY OFFICER-II\nDEBTS RECOVERY TRIBUNAL-III\n RC No. 162/2015\nSBI
> VS RAMESH GUPTA.\n Dated: 01.03.2016 Item no.01\n
> Present: Ms. Sonakshi, the proxy counsel for Ms. Usha Singh, the counsel
> for ARCIL.\n None for the CDs.\n The counsel for the CHFI
> submitted that the matter has been assigned to ARCIL and deed of
> assignment, application for substituting the name and vakalatnama has been
> filed vide diary no. 1454 dated 08.02.2016\nIn the application it has been
> prayed that ARCIL may be substituted in place of SBI for the purpose of
> further proceedings in the matter. Request allowed.\nThe proxy counsel for
> CHFI further requested to issue demand notice thereby mentioning the name
> of ARCIL. Request allowed.\nRegistry is directed to issue fresh demand
> notice mentioning the name of ARCIL.\nCHFI is directed to file status of
> the mortgaged property as well as other assets of the CDs.\nList the case
> on 28.03.2016.\n (SUJEET KUMAR)\nRECOVERY OFFICER-II."
>
> My regular expression is:
>
> parties_present_start_1=
> regexpr("\n.*Present.*\n.*\n",my_text,ignore.case=TRUE,perl=T)
>
> parties_present_start_2=
> regexpr("\n.*Present.*\n.*\n",my_text,ignore.case=TRUE)
>
>> parties_present_start_1
> [1] 138
> attr(,"match.length")
> [1] 123
> attr(,"useBytes")
> [1] TRUE
>> parties_present_start_2
> [1] 20
> attr(,"match.length")
> [1] 949
> attr(,"useBytes")
> [1] TRUE
>>
>
> Why do I see the correct result only in the first case?
>
> Best Regards,
> Ashim
>
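You do not need perl = TRUE for ignore.case = TRUE; the two results differ because of how '.' is treated.
With perl = TRUE (PCRE), '.' matches any character except a newline.
With R's default (extended) regular expressions, '.' matches any character, including a newline, so a greedy '.*' can run across line breaks.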
test <- "hello\n1"
regexpr(".*[0-9]", test)
## [1] 1
## attr(,"match.length")
## [1] 7
## attr(,"useBytes")
## [1] TRUE
regexpr(".*[0-9]", test, perl = TRUE)
## [1] 7
## attr(,"match.length")
## [1] 1
## attr(,"useBytes")
## [1] TRUE
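
If you want the line-by-line behaviour without perl = TRUE, one option is to replace '.' with the negated class [^\n]. Going the other way, the PCRE inline flag (?s) lets '.' match a newline under perl = TRUE. The sketch below is only an illustration: 'txt' is a made-up, shortened stand-in for the original text, and the variable names are my own.

```r
test <- "hello\n1"

# Default (extended) regex: [^\n] stops at the newline, unlike '.'.
m_default <- regexpr("[^\n]*[0-9]", test)

# PCRE: the inline flag (?s) makes '.' match a newline as well.
m_perl <- regexpr("(?s).*[0-9]", test, perl = TRUE)

# ignore.case = TRUE works with either engine.
m_ic_default <- regexpr("HELLO", test, ignore.case = TRUE)
m_ic_perl    <- regexpr("HELLO", test, ignore.case = TRUE, perl = TRUE)

# Hypothetical, shortened stand-in for the text in the question.
txt <- "HEADER\n Present: A, counsel for B.\n None for the CDs.\n Rest.\nEnd."

# The pattern from the question, with '.' replaced by [^\n] ...
m_class <- regexpr("\n[^\n]*Present[^\n]*\n[^\n]*\n", txt, ignore.case = TRUE)

# ... and the pattern as posted, using PCRE.
m_pcre <- regexpr("\n.*Present.*\n.*\n", txt, ignore.case = TRUE, perl = TRUE)

# Extract the matched text for comparison.
hit_class <- regmatches(txt, m_class)
hit_pcre  <- regmatches(txt, m_pcre)
```

Both variants should pick out the same two lines here, so which one to use is mostly a matter of taste.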
--
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net