[R] regexpr: R takes very long with non-existent pattern

Leonard Mada
Thu May 19 02:08:24 CEST 2022


Dear R Users,


I have run the following command in R:

# x = large character vector (1200 PubMed abstracts);
# patt = not defined (left undefined by mistake);
npos = regexpr(patt, x, perl=TRUE);
# Error in regexpr(patt, x, perl = TRUE) : object 'patt' not found


The problem:

R becomes unresponsive and it takes 1-2 minutes to return the error. The 
operation completes almost instantaneously with a valid pattern.

Is there a reason for this behavior?

Tested with R 4.2.0 on MS Windows 10.


I have uploaded a set of 1200 PubMed abstracts to GitHub, in case anyone 
wants to check:

- see file: Example_Abstracts_Title_Pubmed.csv;

https://github.com/discoleo/R/tree/master/TextMining/Pubmed

The variable patt was left undefined by mistake, but R took a very long 
time to abort the call and report the error.
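
For anyone without the CSV at hand, a minimal timing sketch might look like 
the following. The synthetic strings are an assumption standing in for the 
abstracts, and the pattern "ab" is only a placeholder for a valid pattern. 
Wrapping the call in tryCatch() changes how the error is handled, so the 
delay seen at the console may not reproduce this way.

```r
# Synthetic stand-in for the 1200 abstracts (assumption: random text,
# not the real data from the CSV mentioned above).
set.seed(1)
x <- vapply(seq_len(1200), function(i) {
  paste(sample(c(letters, " "), 1500, replace = TRUE), collapse = "")
}, character(1))

# Baseline: a valid (placeholder) pattern.
t_ok <- system.time(npos <- regexpr("ab", x, perl = TRUE))

# Make sure patt really is undefined in the current environment.
if (exists("patt", inherits = FALSE)) rm(patt)

# Undefined pattern: catch the error so the script continues, and time it.
t_err <- system.time(
  err <- tryCatch(regexpr(patt, x, perl = TRUE), error = function(e) e)
)

print(t_ok["elapsed"])
print(t_err["elapsed"])
print(conditionMessage(err))
```

Comparing the two elapsed times, and then repeating the undefined-pattern 
call directly at the console without tryCatch(), may help to narrow down 
whether the delay comes from the call itself or from reporting the error.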


Many thanks,


Leonard


