[R] regex pattern assistance

Marc Schwartz marc_schwartz at me.com
Fri Aug 15 19:25:32 CEST 2014


On Aug 15, 2014, at 11:56 AM, Tom Wright <tom at maladmin.com> wrote:

> WOW!!!
> 
> What can I say? 4 answers in less than 4 minutes. Thank you everyone. If
> I can't make it work now I don't deserve to.
> 
> btw. the strsplit approach wouldn't work for me as:
> a) I wanted to play with regex and 
> b) the location isn't consistent.


Tom,

If it is not always in the same relative position, is the substring pattern always the same? That is, 3 alphanumeric characters, a hyphen, then 3 alphanumeric characters? If so, could any other part of the path follow the same pattern, or is it unique?

If the pattern is always the same and occurs only once in the path, then:

> gsub(".*([[:alnum:]]{3}-[[:alnum:]]{3}).*", "\\1", x)
[1] "S01-012"


is another possible alternative, and a more flexible one, since it does not depend on where the component sits in the path:

y <- "/mnt/AO/AO Data/Another Level/Yet Another Level/S01-012/120824/"

> gsub(".*([[:alnum:]]{3}-[[:alnum:]]{3}).*", "\\1", y)
[1] "S01-012"


z <- "/mnt/AO/AO Data/Another Level/Yet Another Level/S01-012/One More Level/120824/"

> gsub(".*([[:alnum:]]{3}-[[:alnum:]]{3}).*", "\\1", z)
[1] "S01-012"
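
One caveat, in case some paths contain no such component: gsub() returns its input unchanged when the pattern does not match, so a non-matching path comes back whole. A minimal sketch of an alternative using regexpr() and regmatches(), which leaves NA for those cases. The paths vector and the names pattern, stripped and ids are purely illustrative:

```r
# Hypothetical example paths; the third has no "XXX-XXX" component.
paths <- c("/mnt/AO/AO Data/S01-012/120824/",
           "/mnt/AO/AO Data/Another Level/S01-012/One More Level/120824/",
           "/mnt/AO/AO Data/120824/")

pattern <- "[[:alnum:]]{3}-[[:alnum:]]{3}"

# gsub() approach: a path without a match is returned unchanged.
stripped <- gsub(paste0(".*(", pattern, ").*"), "\\1", paths)

# regexpr() gives the position of the first match in each element
# (-1 if there is none); regmatches() returns only the matched pieces.
m <- regexpr(pattern, paths)
ids <- rep(NA_character_, length(paths))
ids[m != -1] <- regmatches(paths, m)

stripped
ids
```

Here stripped[3] is still the full path, while ids[3] is NA, which is easier to test for afterwards.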


> 
> Nice to see email support still works, not everything has moved to
> linkedin and stackoverflow.


Stackoverflow?  ;-)

Regards,

Marc


> 
> 
> Thanks again,
> Tom
> 
> 
> On Fri, 2014-08-15 at 12:18 -0400, Tom Wright wrote:
>> Hi,
>> Can anyone please assist?
>> 
>> given the string 
>> 
>>> x<-"/mnt/AO/AO Data/S01-012/120824/"
>> 
>> I would like to extract "S01-012"
>> 
>> require(stringr)
>>> str_match(x,"\\/mnt\\/AO\\/AO Data\\/(.+)\\/+")
>>> str_match(x,"\\/mnt\\/AO\\/AO Data\\/(\\w+)\\/+")
>> 
>> both nearly work. I expected I would use something like:
>>> str_match(x,"\\/mnt\\/AO\\/AO Data\\/([\\w -]+)\\/+")
>> 
>> but I don't seem able to get the square bracket grouping to work
>> correctly. Can someone please show me where I am going wrong?
>> 
>> Thanks,
>> Tom


