[R] Regex for Special Characters under Grep

Prof Brian Ripley ripley at stats.ox.ac.uk
Fri Jun 13 08:29:58 CEST 2008


On Thu, 12 Jun 2008, Henrik Bengtsson wrote:

> A regular set is given by "[<set>]".  The complementary set is given
> by "[^<set>]" where <set> is a set of symbols.  I don't think you have
> to escape symbols in <set> (but I might be wrong).

This is covered in ?regexp.  The metacharacters in character classes (the 
official name for your 'regular set') are ^]-\.
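
As a rough illustration of those placement rules (a sketch only: the sample
vector and variable names are made up for the example, and it assumes R's
default regex engine rather than perl=TRUE), each of '^', ']' and '-' can be
made literal inside a class by position alone:

```r
lines <- c("abc", "!abc", "#abc", "^abc", " #abc", "]abc", "-abc")

# '^' negates the class only when it is the first character;
# anywhere else in the class it is a literal caret.
starts_bang_hash_caret <- grepl("^[!#^]", lines)

# ']' is literal when it comes first in the class,
# '-' is literal when it comes first or last.
starts_bracket_or_dash <- grepl("^[]-]", lines)

# Escaping the caret (backslash doubled in the R string) selects the same lines.
same_as_escaped <- identical(grepl("^[!#\\^]", lines), starts_bang_hash_caret)

print(lines[starts_bang_hash_caret])
print(lines[starts_bracket_or_dash])
```

Ordering the class members this way avoids backslashes altogether, which
sidesteps the question of how many of them R's string parser will consume.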

> In any case, this does what you want:
>
>> lines <- c("abc", "!abc", "#abc", "^abc", " #abc")
>> pattern <- "^[^!#^]";
>> grep(pattern, lines, value=TRUE)
> [1] "abc"   " #abc"
>
> /Henrik
>
>
> On Thu, Jun 12, 2008 at 8:06 PM, Marc Schwartz
> <marc_schwartz at comcast.net> wrote:
>> on 06/12/2008 08:42 PM Gundala Viswanath wrote:
>>>
>>> Hi all,
>>>
>>> I am trying to capture lines of a file that DO NOT
>>> start with the following header: !, #, ^
>>>
>>> But somehow my regex used under grep doesn't
>>> work.
>>>
>>> Please advise what's wrong with my code below.
>>>
>>> __BEGIN__
>>> in_fname <- paste("mydata.txt", ".soft", sep="")
>>> data_for_R <- paste("data_for_R/", args[3], ".softR", sep="")
>>>
>>> # my regex construction
>>> cat(temp[-grep("^[\^\!\#]",temp,perl=TRUE)], file=data_for_R, sep="\n")
>>>
>>>
>>> dat <- read.table(data_for_R)
>>> ___END__
>>>
>>
>> You need to double the escape character when it is used to escape
>> metacharacters in a regex, because R itself consumes one backslash when
>> it parses the string. Note also that the only metacharacter in your
>> sequence is the caret ('^').
>>
>> Lines <- c("! Not This Line", "# Not This Line", "^ Not This Line",
>>           "This Line")
>>
>>> Lines
>> [1] "! Not This Line" "# Not This Line" "^ Not This Line"
>> [4] "This Line"
>>
>>> grep("^[!#\\^]", Lines)
>> [1] 1 2 3
>>
>>> Lines[-grep("^[!#\\^]", Lines)]
>> [1] "This Line"
>>
>>
>> HTH,
>>
>> Marc Schwartz
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
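
One further point about the temp[-grep(...)] idiom used in the thread, shown
here as a hedged sketch: the sample data and variable names are invented, and
it assumes a version of R that provides grepl(). Negative indexing behaves as
intended only when at least one line matches; with no matches it silently
drops everything, whereas a logical mask does not.

```r
temp <- c("! header", "# comment", "^ marker", "1 2 3", "4 5 6")
pattern <- "^[!#^]"

# Negative indexing works while at least one line matches ...
kept_index <- temp[-grep(pattern, temp)]

# ... but with no matches grep() returns integer(0), and x[-integer(0)]
# selects nothing, so every line is dropped.
clean <- c("1 2 3", "4 5 6")
dropped_all <- clean[-grep(pattern, clean)]

# A logical mask from grepl() keeps all lines in that case.
kept_mask <- clean[!grepl(pattern, clean)]

print(kept_index)
print(length(dropped_all))
print(kept_mask)
```

The complemented pattern "^[^!#^]" with value=TRUE avoids the same problem,
but it requires a first character and so also drops empty lines.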

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


