[R] regex pattern assistance
Marc Schwartz
marc_schwartz at me.com
Fri Aug 15 18:41:42 CEST 2014
On Aug 15, 2014, at 11:18 AM, Tom Wright <tom at maladmin.com> wrote:
> Hi,
> Can anyone please assist.
>
> given the string
>
>> x<-"/mnt/AO/AO Data/S01-012/120824/"
>
> I would like to extract "S01-012"
>
> require(stringr)
>> str_match(x,"\\/mnt\\/AO\\/AO Data\\/(.+)\\/+")
>> str_match(x,"\\/mnt\\/AO\\/AO Data\\/(\\w+)\\/+")
>
> both nearly work. I expected I would use something like:
>> str_match(x,"\\/mnt\\/AO\\/AO Data\\/([\\w -]+)\\/+")
>
> but I don't seem able to get the square bracket grouping to work
> correctly. Can someone please show me where I am going wrong?
>
> Thanks,
> Tom
Is the desired substring always in the same relative position in the path?
If so:
> strsplit(x, "/")
[[1]]
[1] ""        "mnt"     "AO"      "AO Data" "S01-012" "120824"
> unlist(strsplit(x, "/"))[5]
[1] "S01-012"
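If you have several paths to process, the same idea can be wrapped in a small function. This is only a sketch: path_component() is a made-up name, not a base R or stringr function, and it assumes every path has the component you want at the same position. Note that the leading "/" produces an empty first element, which is why "S01-012" is element 5 rather than 4.

```r
# Hypothetical helper (not part of base R or stringr): return the n-th
# "/"-separated component of each path, or NA if a path is too short.
path_component <- function(paths, n) {
  # fixed = TRUE treats "/" as a literal string, not a regex
  parts <- strsplit(paths, "/", fixed = TRUE)
  vapply(parts, function(p) p[n], character(1))
}

x <- "/mnt/AO/AO Data/S01-012/120824/"
path_component(x, 5)
# [1] "S01-012"
```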
Alternatively, again, presuming the same position:
> gsub("/mnt/AO/AO Data/([^/]+)/.+", "\\1", x)
[1] "S01-012"
You don't need the double backslashes in front of each '/' in your regex above. The '/' character is not a special regex character, so it matches itself unescaped, whereas '\' is special and needs to be escaped.
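As an illustration, here is a sketch of the capture-group approach with the slashes left unescaped, using base R's regexec() and regmatches() rather than stringr. It assumes, as above, that the path always starts with "/mnt/AO/AO Data/" and that the part you want runs up to the next "/".

```r
x <- "/mnt/AO/AO Data/S01-012/120824/"

# "/" is written as is; [^/]+ captures everything up to the next slash
pattern <- "/mnt/AO/AO Data/([^/]+)/"

# regmatches() returns a list; for each input the first element is the
# full match and the following elements are the captured groups
m <- regmatches(x, regexec(pattern, x))
m[[1]][2]
# [1] "S01-012"
```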
Regards,
Marc Schwartz