[R] gsub issue with consecutive pattern finds
Iris Simmons
|kw@|mmo @end|ng |rom gm@||@com
Fri Mar 1 12:49:02 CET 2024
Hi Iago,
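This is not a bug; it is expected behavior. Matches may not overlap: once a
vowel has been consumed as the second character of one match, it cannot also
start the next match. That is why "io" matches but "ou" does not. However,
there is a way to get the result you want using Perl-compatible regular
expressions (perl = TRUE):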
```R
gsub("([aeiouAEIOU])(?=[aeiouAEIOU])", "\\1_", "aerioue", perl = TRUE)
```
The specific change I made is called a positive lookahead. You can read
more about it here:
https://www.regular-expressions.info/lookaround.html
It's a way to check for a piece of text without consuming it in the match.
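To make the "without consuming" part concrete, here is a small sketch that lists what each pattern actually matches. The variable names (x, consumed, peeked) are only for illustration, and I dropped the capture group and the upper case letters to keep it short:

```R
x <- "aerioue"

# Plain two-vowel pattern: each match consumes both vowels, so the "o"
# used up by "io" is no longer available to start an "ou" match.
consumed <- regmatches(x, gregexpr("[aeiou][aeiou]", x))[[1]]
print(consumed)
# [1] "ae" "io" "ue"

# Lookahead pattern: only the first vowel is part of the match; the
# second vowel is checked but left in place for the next match.
peeked <- regmatches(x, gregexpr("[aeiou](?=[aeiou])", x, perl = TRUE))[[1]]
print(peeked)
# [1] "a" "i" "o" "u"
```

Every vowel that is followed by another vowel shows up as its own match in the second case, which is why the underscore gets inserted after each of them.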
Also, since you don't care about character case, it might be more legible
to add ignore.case = TRUE and remove the upper case characters:
```R
gsub("([aeiou])(?=[aeiou])", "\\1_", "aerioue",
     perl = TRUE, ignore.case = TRUE)
## or
gsub("(?i)([aeiou])(?=[aeiou])", "\\1_", "aerioue", perl = TRUE)
```
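If you want to convince yourself that these variants agree, a quick check along these lines should do. The test strings and the names p1 to p4 are made up for illustration, and p4 is an extra alternative that was not part of my suggestion above: it uses a lookbehind plus a lookahead, so the match is the empty position between two vowels and no capture group is needed.

```R
x <- c("aerioue", "AEriOUe", "xyz")

p1 <- gsub("([aeiouAEIOU])(?=[aeiouAEIOU])", "\\1_", x, perl = TRUE)
p2 <- gsub("([aeiou])(?=[aeiou])", "\\1_", x, perl = TRUE, ignore.case = TRUE)
p3 <- gsub("(?i)([aeiou])(?=[aeiou])", "\\1_", x, perl = TRUE)

# Alternative sketch: match the zero-width gap between two vowels
p4 <- gsub("(?i)(?<=[aeiou])(?=[aeiou])", "_", x, perl = TRUE)

print(p1)
# [1] "a_eri_o_u_e" "A_Eri_O_U_e" "xyz"

# All four approaches give the same result on these inputs
stopifnot(identical(p1, p2), identical(p1, p3), identical(p1, p4))
```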
I hope this helps!
On Fri, Mar 1, 2024, 06:37 Iago Giné Vázquez <iago.gine using sjd.es> wrote:
> Hi all,
>
> I tested the following command:
>
> gsub("([aeiouAEIOU])([aeiouAEIOU])", "\\1_\\2", "aerioue")
>
> with the following output:
>
> [1] "a_eri_ou_e"
>
> So, there are two consecutive vowels where an underscore is not added.
>
> Could it be a bug? Is it expected (bug or not)? Is there any way to get
> what I want (an underscore between each pair of consecutive vowels)?
>
>
> Thank you!
>
> Best regards,
>
> Iago