[R] Regex for ^ (the caret symbol)?

David Winsemius dwinsemius at comcast.net
Mon Jan 21 19:55:11 CET 2013


On Jan 21, 2013, at 10:05 AM, Jeff Newmiller wrote:

> So what is the special behavior of the ^ symbol when not at the  
> beginning of the string that occurs when it is not escaped?

Isn't there a distinction between what _is_ "special" and what should
be "special"? You are saying that "^" after the beginning of a pattern
should not be special, and by extension that "$" before the end of a
pattern should not be special. What about the potential desire to have
a regex alternation that picks from one of two patterns, each anchored
at the beginning of a target? Doesn't "^" need to remain special to
allow this:

 > grep("^thet|^that",  c("thet is", "that is"))
[1] 1 2
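
A slightly fuller sketch of the same point, reusing the `temp` vector from the original question further down the thread. The variable names (`anchored`, `unescaped`, `escaped`, `literal`) and the extra test string "not that" are only for illustration and are not from anyone's post. It contrasts the anchored alternation with the unescaped mid-pattern caret, and shows two ways that should match a literal caret: the backslash escape Tsjerk suggested, and `fixed = TRUE`, which turns off regex interpretation altogether.

```r
temp <- c("latitude^2", "latitude and latitude^2",
          "longitude^2", "longitude and longitude^2")

# Each branch of the alternation carries its own start-of-string anchor,
# so "not that" (illustrative extra string) is not matched.
anchored <- grep("^thet|^that", c("thet is", "that is", "not that"))
anchored
# [1] 1 2

# Unescaped, the caret in the middle of the pattern is still an anchor,
# so "latitude" followed by start-of-string can never match.
unescaped <- grepl("latitude^2", temp)
unescaped
# [1] FALSE FALSE FALSE FALSE

# Escaping the caret makes it an ordinary character.
escaped <- grepl("latitude\\^2", temp)
escaped
# [1]  TRUE  TRUE FALSE FALSE

# fixed = TRUE treats the whole pattern as a literal string.
literal <- grepl("latitude^2", temp, fixed = TRUE)
```

If no regex features are needed at all, `fixed = TRUE` is probably the least error-prone choice, since nothing in the pattern needs escaping.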

> ---------------------------------------------------------------------------
> Jeff Newmiller                        The     .....       .....  Go Live...
> DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>                                      Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
> ---------------------------------------------------------------------------
> Sent from my phone. Please excuse my brevity.
>
> Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
>
>> On 13-01-21 11:48 AM, Jeff Newmiller wrote:
>>> I am not sure I understand what worked perfectly, since it is my
>>> understanding that ^ is only special at the beginning of the regex (to
>>> anchor the pattern at the beginning of the target string) or as the
>>> first character of a character set (to indicate exclusion of the listed
>>> characters). In any other position the caret should behave like an
>>> ordinary character. That is, your original pattern should have worked
>>> as-is. This is supported by the help page documentation for regex in
>>> the paragraph below the definition of [:xdigit:]. I think this is a bug
>>> in R.
>>
>> It's a documentation error rather than a bug.  The ^ character is
>> special anywhere in the extended RE syntax defined by the TRE library
>> or the Perl-compatible library that we use.  This is inconsistent with
>> the POSIX standard, which might be what you were thinking of.
>>
>> Duncan Murdoch
>>
>>>
>>> mtb954 at gmail.com wrote:
>>>
>>>> Hi Tsjerk, many thanks...that worked perfectly!
>>>>
>>>> Mark Na
>>>>
>>>>
>>>>
>>>> On Mon, Jan 21, 2013 at 9:36 AM, Tsjerk Wassenaar <tsjerkw at gmail.com>
>>>> wrote:
>>>>
>>>>> Oh, I'm jetlagged. ^ is a control character for 'start of string'. In
>>>>> the context of a character set it means negation: [^a-z].
>>>>>
>>>>> Ciao,
>>>>>
>>>>> Tsjerk
>>>>>
>>>>>
>>>>> On Mon, Jan 21, 2013 at 4:33 PM, Tsjerk Wassenaar <tsjerkw at gmail.com> wrote:
>>>>>
>>>>>> Hi Mark Na,
>>>>>>
>>>>>> Try:
>>>>>>
>>>>>> grepl("latitude\\^2",temp)
>>>>>>
>>>>>> ^ is a control character for negation, so you have to escape it.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Tsjerk
>>>>>>
>>>>>>
>>>>>> On Mon, Jan 21, 2013 at 4:26 PM, <mtb954 at gmail.com> wrote:
>>>>>>
>>>>>>> Hello R-helpers,
>>>>>>>
>>>>>>> I am trying to search for a string that includes the caret symbol,
>>>>>>> using the following code:
>>>>>>>
>>>>>>> grepl("latitude^2",temp)
>>>>>>>
>>>>>>>
>>>>>>> And R doesn't like that. It gives me:
>>>>>>>
>>>>>>>> temp<-c("latitude^2","latitude and latitude^2","longitude^2","longitude and longitude^2")
>>>>>>>> temp
>>>>>>> [1] "latitude^2"                "latitude and latitude^2"   "longitude^2"               "longitude and longitude^2"
>>>>>>>> grepl("latitude^2",temp)
>>>>>>> [1] FALSE FALSE FALSE FALSE
>>>>>>>
>>>>>>>
>>>>>>> I think this must be a regex problem, but I can't find out how to
>>>>>>> specify the caret using regex.
>>>>>>>
>>>>>>> I would appreciate any help you could provide.
>>>>>>>
>>>>>>> Many thanks,
>>>>>>>
>>>>>>> Mark Na
>>>>>>>
>>>>>>>         [[alternative HTML version deleted]]
>>>>>>>
>>>>>>> ______________________________________________
>>>>>>> R-help at r-project.org mailing list
>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>>> PLEASE do read the posting guide
>>>>>>> http://www.R-project.org/posting-guide.html
>>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Tsjerk A. Wassenaar, Ph.D.
>>>>>>
>>>>>> post-doctoral researcher
>>>>>> Biocomputing Group
>>>>>> Department of Biological Sciences
>>>>>> 2500 University Drive NW
>>>>>> Calgary, AB T2N 1N4
>>>>>> Canada
>>>>>>

David Winsemius, MD
Alameda, CA, USA


