[R] Regex for ^ (the caret symbol)?

Duncan Murdoch murdoch.duncan at gmail.com
Mon Jan 21 20:07:34 CET 2013


On 13-01-21 1:05 PM, Jeff Newmiller wrote:
> So what is the special behavior of the ^ symbol when not at the beginning of the string that occurs when it is not escaped?

I think it retains its meaning as an assertion that the current position 
is the beginning of the line, and so a pattern like "a^b" could never 
match anything.  That's not very useful in this context, but I expect 
it's easier to implement for complicated patterns, where some paths 
through the pattern put the ^ at the beginning and others don't, e.g.

(a|)^b

has two possible patterns:  a^b (which can never match) and ^b.
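
A quick way to check this, as a minimal sketch: the test vector x below is 
made up for illustration and is not from the original post.  The default 
(TRE) engine is used unless perl = TRUE is given.

```r
# Sketch only: `x` is a hypothetical test vector, not data from the thread.
x <- c("b", "ab", "a^b")

# Unescaped, ^ is still a start-of-string assertion, even mid-pattern.
unescaped <- grepl("a^b", x)

# Escaping the caret, or using fixed = TRUE, makes it a literal character.
escaped <- grepl("a\\^b", x)
literal <- grepl("a^b", x, fixed = TRUE)

# (a|)^b: only the empty branch can satisfy ^, so this acts like ^b.
alternation      <- grepl("(a|)^b", x)
alternation_pcre <- grepl("(a|)^b", x, perl = TRUE)

print(rbind(unescaped, escaped, literal, alternation, alternation_pcre))
```

If ^ is an assertion everywhere, only the escaped and fixed = TRUE calls 
should find "a^b", and (a|)^b should match only the string "b".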

Duncan Murdoch

> ---------------------------------------------------------------------------
> Jeff Newmiller                        The     .....       .....  Go Live...
> DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>                                        Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
> ---------------------------------------------------------------------------
> Sent from my phone. Please excuse my brevity.
>
> Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
>
>> On 13-01-21 11:48 AM, Jeff Newmiller wrote:
>>> I am not sure I understand what worked perfectly, since it is my
>>> understanding that ^ is only special at the beginning of the regex (to
>>> anchor the pattern at the beginning of the target string) or as the
>>> first character of a character set (to indicate exclusion of the listed
>>> characters). In any other position the caret should behave like an
>>> ordinary character. That is, your original pattern should have worked
>>> as-is. This is supported by the help page documentation for regex in
>>> the paragraph below the definition of [:xdigit:]. I think this is a bug
>>> in R.
>>
>> It's a documentation error rather than a bug.  The ^ character is
>> special anywhere in the extended RE syntax defined by the TRE library
>> or the Perl-compatible library that we use.  This differs from POSIX
>> basic REs, where ^ is an anchor only at the start of the pattern, which
>> might be what you were thinking of.
>>
>> Duncan Murdoch
>>
>>
>>
>>>
>>>
>>> mtb954 at gmail.com wrote:
>>>
>>>> Hi Tsjerk, many thanks...that worked perfectly!
>>>>
>>>> Mark Na
>>>>
>>>>
>>>>
>>>> On Mon, Jan 21, 2013 at 9:36 AM, Tsjerk Wassenaar <tsjerkw at gmail.com> wrote:
>>>>
>>>>> Oh, I'm jetlagged. ^ is a control character for 'start of string'.
>>>>> In the context of a character set it means negation: [^a-z].
>>>>>
>>>>> Ciao,
>>>>>
>>>>> Tsjerk
>>>>>
>>>>>
>>>>> On Mon, Jan 21, 2013 at 4:33 PM, Tsjerk Wassenaar <tsjerkw at gmail.com> wrote:
>>>>>
>>>>>> Hi Mark Na,
>>>>>>
>>>>>> Try:
>>>>>>
>>>>>> grepl("latitude\\^2",temp)
>>>>>>
>>>>>> ^ is a control character for negation, so you have to escape it.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Tsjerk
>>>>>>
>>>>>>
>>>>>> On Mon, Jan 21, 2013 at 4:26 PM, <mtb954 at gmail.com> wrote:
>>>>>>
>>>>>>> Hello R-helpers,
>>>>>>>
>>>>>>> I am trying to search for a string that includes the caret symbol,
>>>>>>> using the following code:
>>>>>>>
>>>>>>> grepl("latitude^2",temp)
>>>>>>>
>>>>>>>
>>>>>>> And R doesn't like that. It gives me:
>>>>>>>
>>>>>>>> temp<-c("latitude^2","latitude and latitude^2","longitude^2","longitude and longitude^2")
>>>>>>>> temp
>>>>>>> [1] "latitude^2"                "latitude and latitude^2"
>>>>>>> [3] "longitude^2"               "longitude and longitude^2"
>>>>>>>> grepl("latitude^2",temp)
>>>>>>> [1] FALSE FALSE FALSE FALSE
>>>>>>>
>>>>>>>
>>>>>>> I think this must be a regex problem, but I can't find out how to
>>>>>>> specify the caret using regex.
>>>>>>>
>>>>>>> I would appreciate any help you could provide.
>>>>>>>
>>>>>>> Many thanks,
>>>>>>>
>>>>>>> Mark Na
>>>>>>>
>>>>>>> ______________________________________________
>>>>>>> R-help at r-project.org mailing list
>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>>> PLEASE do read the posting guide
>>>>>>> http://www.R-project.org/posting-guide.html
>>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Tsjerk A. Wassenaar, Ph.D.
>>>>>>
>>>>>> post-doctoral researcher
>>>>>> Biocomputing Group
>>>>>> Department of Biological Sciences
>>>>>> 2500 University Drive NW
>>>>>> Calgary, AB T2N 1N4
>>>>>> Canada
>>>>>>
>>>>
>
