[R] platform dependent regex

Ista Zahn istazahn at gmail.com
Tue Feb 9 18:39:00 CET 2016


Hi Jim,

Bah, yes, I meant,

## Windows:
grepl("\\W", "س")  # TRUE

## OS X:
grepl("\\W", "س")  # TRUE

## Linux:
grepl("\\W", "س")  # FALSE

Sorry about that. My original example used gsub, but I thought
switching to a grepl example would be clearer. Thank you.
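
One possible way to sidestep the discrepancy, offered as a minimal sketch
rather than a confirmed fix: spell out what counts as a word character
using Unicode properties and perl = TRUE, instead of relying on \W with
the default engine. The helper name is_nonword is hypothetical, and the
sketch assumes R's PCRE was built with Unicode property support (which
pcre_config() can report).

```r
# Sketch only: is_nonword() is a hypothetical helper, not part of base R.
# Assumes PCRE with Unicode property support; check with pcre_config().

x <- "\u0633"  # the same Arabic letter as in the examples above

# Default engine: per the report above, the result differs by platform.
grepl("\\W", x)

# Explicit definition: anything that is not a Unicode letter, a Unicode
# number, or an underscore is treated as a non-word character.
is_nonword <- function(s) grepl("[^\\p{L}\\p{N}_]", s, perl = TRUE)

is_nonword(x)      # FALSE: the letter counts as a word character
is_nonword("a b")  # TRUE: the space is a non-word character
```

Because the character class is written out explicitly, the result should
not depend on how the platform's default engine classifies non-ASCII
characters.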

-- Ista

On Tue, Feb 9, 2016 at 12:10 PM, jim holtman <jholtman at gmail.com> wrote:
> why 3 parameters on the 'grepl'?  Did you mean to say:
>
> grepl("\\W", "س")  # FALSE
>
>
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.
>
> On Tue, Feb 9, 2016 at 11:55 AM, Ista Zahn <istazahn at gmail.com> wrote:
>>
>> I just spent a day and a half debugging someone's code, only to
>> discover that the problem is platform dependent regular expressions.
>> For example:
>>
>> ## Windows:
>> grepl("\\W", "", "س")  # TRUE
>>
>> ## OS X:
>> grepl("\\W", "", "س")  # TRUE
>>
>> ## Linux:
>> grepl("\\W", "", "س")  # FALSE
>>
>> Ouch. The documentation does say "Certain named classes of characters
>> are predefined.  Their interpretation depends on the _locale_", but
>> that doesn't seem to cover it given that the locale on OS X and Linux
>> was the same (en_US.UTF-8).
>>
>> Question: Is this considered a bug, and if so what can I do to help
>> fix it? I've checked and the issue is present in both r-patched and
>> r-devel.
>>
>> Best,
>> Ista
>>

