[R] regular expression help

C Lin baccts at hotmail.com
Mon Jun 30 16:00:49 CEST 2014


Hi, Bill

Thank you so much for your kind explanation. It's very clear, even for someone like me.
I should've remembered this, but somehow I forgot that [] has a special meaning in regular expressions.

Lin

----------------------------------------
> From: wdunlap at tibco.com
> Date: Sun, 29 Jun 2014 13:16:26 -0700
> Subject: Re: [R] regular expression help
> To: baccts at hotmail.com
> CC: dwinsemius at comcast.net; r-help at r-project.org
>
>> what's the difference between [:space:]+ and [[:space:]]+ ?
>
> The pattern '[:space:]' matches any of ':', 's', 'p', 'a', 'c', and
> 'e' (the second colon is superfluous). I.e., it has no magic meaning.
> Inside of [] it does have a special meaning.
>
> The pattern '[[:space:]]' matches a space, a newline, and other
> whitespace characters. The pattern '[a-c[:space:]z[:digit:]]' matches
> 'a', 'b', 'c', 'z', any decimal digit, and any whitespace character.
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
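
The difference described above can be checked directly. This is only a minimal sketch, assuming R's default regex engine (perl = FALSE); the vector `chars` and the result names are made up for illustration and are not from the thread.

```r
# Hypothetical sample input: the characters of ":space" plus some real whitespace.
chars <- c(":", "s", "p", "a", "c", "e", " ", "\t", "\n", "b")

# '[:space:]' is an ordinary bracket list holding ':', 's', 'p', 'a', 'c', 'e'.
as_list <- grepl("[:space:]", chars)

# '[[:space:]]' is a bracket list holding the POSIX class [:space:].
as_class <- grepl("[[:space:]]", chars)

# A class can be mixed with ranges and literal characters in one bracket list.
mixed <- grepl("[a-c[:space:]z[:digit:]]", c("a", "b", "c", "z", "7", " ", "d"))

# encodeString() makes the tab and newline visible in the printed table.
print(data.frame(chars = encodeString(chars), as_list, as_class))
print(mixed)
```

In this sketch the two result columns should be mirror images for the first nine entries: the letters match only the single-bracket form, the whitespace characters match only the double-bracket form, and 'b' matches neither.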
>
>
> On Fri, Jun 27, 2014 at 6:27 AM, C Lin <baccts at hotmail.com> wrote:
>> Thank you all for your help.
>>
>> Bill, thanks for making it compact and I did mean any amount of whitespace.
>>
>> To break it down, so I know why this pattern works:
>> The first parenthesis means that before AARSD1 it can be
>> ^: begins with nothing
>> |: or
>> //: double slash or
>> [[:space:]]+: one or more whitespace characters
>>
>> For the second parenthesis:
>> $: ending with nothing
>>
>> Does this sound correct?
>>
>> I missed the fact that I need the ^ and $, and I always write [:space:]+ instead of [[:space:]]+.
>> What's the difference between [:space:]+ and [[:space:]]+ ?
>>
>> Thanks so much!
>> Lin
>>
>> ----------------------------------------
>>> From: wdunlap at tibco.com
>>> Date: Fri, 27 Jun 2014 02:35:54 -0700
>>> Subject: Re: [R] regular expression help
>>> To: dwinsemius at comcast.net
>>> CC: baccts at hotmail.com; r-help at r-project.org
>>>
>>> You can use parentheses to factor out the common string in David's
>>> pattern, as in
>>> grep(value=TRUE, "(^|//|[[:space:]]+)AARSD1($|//|[[:space:]]+)", test)
>>>
>>> (By 'whitespace' I could not tell if you meant any amount of
>>> whitespace or a single whitespace character. I use '+' to match
>>> one or more whitespace characters.)
>>>
>>> Bill Dunlap
>>> TIBCO Software
>>> wdunlap tibco.com
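
As a quick check of the grouped pattern, here is a small sketch that runs it on the six-element `test` vector from the thread. The single-bracket variant `loose` and the strings in `extra` are invented here purely for comparison, and the default regex engine is assumed.

```r
test <- c('AARSD11', 'AARSD1-', 'AARSD1//', 'AARSD1 //', '//AARSD1', 'AARSD1')

# Grouped pattern: (start | // | whitespace) AARSD1 (end | // | whitespace)
pattern <- "(^|//|[[:space:]]+)AARSD1($|//|[[:space:]]+)"
hits <- grep(pattern, test)
print(test[hits])

# Same pattern with the inner brackets dropped, so the "class" becomes
# the literal characters ':', 's', 'p', 'a', 'c', 'e'.
loose <- "(^|//|[:space:]+)AARSD1($|//|[:space:]+)"
loose_hits <- grep(loose, test)
print(test[loose_hits])

# Hypothetical extra strings showing where the two patterns disagree.
extra <- c("sAARSD1e", "x AARSD1 y", "xAARSD1")
print(grepl(pattern, extra))
print(grepl(loose, extra))
```

Note that the grouped pattern is not anchored as a whole, so in this sketch a string such as "x AARSD1 y" also matches; whether that is wanted depends on the data.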
>>>
>>>
>>> On Thu, Jun 26, 2014 at 10:12 PM, David Winsemius
>>> <dwinsemius at comcast.net> wrote:
>>>>
>>>> On Jun 26, 2014, at 6:11 PM, C Lin wrote:
>>>>
>>>>> Hi Duncan,
>>>>>
>>>>> Thanks for trying to help. Sorry for not being clear.
>>>>> The string I'd like to get is 'AARSD1'
>>>>> It can be followed or preceded by white space or // or nothing
>>>>>
>>>>> so, from test <- c('AARSD11','AARSD1-','AARSD1//','AARSD1 //','//AARSD1','AARSD1');
>>>>>
>>>>> I want to match only 'AARSD1//','AARSD1 //','//AARSD1','AARSD1'
>>>>
>>>> Perhaps you want just
>>>>
>>>> grepl('^AARSD1//$|^AARSD1 //$|^//AARSD1$|^AARSD1$', test)
>>>>
>>>> > grepl('^AARSD1//$|^AARSD1 //$|^//AARSD1$|^AARSD1$', test)
>>>> [1] FALSE FALSE  TRUE  TRUE  TRUE  TRUE
>>>>
>>>> --
>>>> David.
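
Each alternative in a pattern like this needs its own anchors, because `|` splits the whole expression. The following sketch, using the six-element `test` vector from the thread, compares a fully anchored version with one whose last alternative has no trailing `$`; the variable names are made up for illustration.

```r
test <- c('AARSD11', 'AARSD1-', 'AARSD1//', 'AARSD1 //', '//AARSD1', 'AARSD1')

# Every alternative is anchored at both ends.
anchored <- grepl('^AARSD1//$|^AARSD1 //$|^//AARSD1$|^AARSD1$', test)

# Last alternative lacks '$', so it matches anything that starts with AARSD1.
open_ended <- grepl('^AARSD1//$|^AARSD1 //$|^//AARSD1$|^AARSD1', test)

print(data.frame(test, anchored, open_ended))
```

With the open-ended last alternative, 'AARSD11' and 'AARSD1-' are matched as well, which is exactly what the question wanted to avoid.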
>>>>
>>>>>
>>>>
>>>>> Thanks,
>>>>> Lin
>>>>>
>>>>> ----------------------------------------
>>>>>> From: dulcalma at bigpond.com
>>>>>> To: baccts at hotmail.com; r-help at r-project.org
>>>>>> Subject: RE: [R] regular expression help
>>>>>> Date: Fri, 27 Jun 2014 10:59:29 +1000
>>>>>>
>>>>>> Hi
>>>>>>
>>>>>> You only have a vector of length 5, and I am not quite sure of the
>>>>>> string you are testing, so try this
>>>>>>
>>>>>> grep('[/]*\\<AARSD1\\>[/]*',test)
>>>>>>
>>>>>> Duncan
>>>>>>
>>>>>> Duncan Mackay
>>>>>> Department of Agronomy and Soil Science
>>>>>> University of New England
>>>>>> Armidale NSW 2351
>>>>>> Email: home: mackay at northnet.com.au
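
For comparison, here is a sketch of how the word-boundary pattern above behaves on the six-element `test` vector given later in the thread. It assumes the default regex engine, where `\\<` and `\\>` match at the start and end of a word; the name `wb_hits` is made up for illustration.

```r
test <- c('AARSD11', 'AARSD1-', 'AARSD1//', 'AARSD1 //', '//AARSD1', 'AARSD1')

# \\< and \\> require word boundaries around AARSD1; '-' is not a word
# character, so 'AARSD1-' still satisfies the boundary after the '1'.
wb_hits <- grep('[/]*\\<AARSD1\\>[/]*', test)
print(test[wb_hits])
```

In this sketch the word boundaries exclude 'AARSD11' but not 'AARSD1-', which is presumably why the follow-up spelled out the allowed neighbours explicitly.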
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
>>>>>> Behalf Of C Lin
>>>>>> Sent: Friday, 27 June 2014 10:05
>>>>>> To: r-help at r-project.org
>>>>>> Subject: [R] regular expression help
>>>>>>
>>>>>> Dear R users,
>>>>>>
>>>>>> I need to match a string. It can be followed or preceded by whitespace or //
>>>>>> or nothing.
>>>>>> How do I code it in R?
>>>>>>
>>>>>> For example:
>>>>>> test <- c('AARSD11','AARSD1-','AARSD1//','AARSD1 //','//AARSD1');
>>>>>> grep('AARSD1(\\s*//*)',test);
>>>>>>
>>>>>> should return 3,4,5 and 6.
>>>>>>
>>>>>
>>>>
>>>>
>>>> David Winsemius
>>>> Alameda, CA, USA
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>

