[R] matching problem

Gabor Grothendieck ggrothendieck at gmail.com
Fri Jun 27 17:11:22 CEST 2008


Here is a solution using strapply from the gsubfn package:

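library(gsubfn)
# example input, copied from the question quoted below
myexstrings <- c("*AAA.AA", "BBB BB", "*.CCC.", "**dd- d")
strapply(myexstrings, "(\\w+).*", backref = -1, simplify = c)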

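The pattern matches the first run of word characters followed by
anything else. Leading non-word characters are skipped because the
match cannot start until the first word character. With backref = -1,
strapply returns only the first backreference of each match, i.e. the
portion within parentheses, and simplify = c combines the results
into a character vector (rather than a list).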
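
For comparison, the same idea can be sketched in base R without
gsubfn. This is not part of the solution above, only an illustrative
alternative; the variable name first_word is made up for the example,
and the input vector is copied from the question quoted below.

```r
# Example input copied from the question quoted below.
myexstrings <- c("*AAA.AA", "BBB BB", "*.CCC.", "**dd- d")

# Skip any leading non-word characters, capture the first run of word
# characters, consume the rest of the string, and keep only the
# captured group via the \\1 backreference.
first_word <- sub("^\\W*(\\w+).*$", "\\1", myexstrings, perl = TRUE)

print(first_word)
```

For this input the result is c("AAA", "BBB", "CCC", "dd"). Note that
sub() returns a string unchanged when the pattern does not match, so
an element containing no word characters at all would pass through
as-is.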

On Fri, Jun 27, 2008 at 6:23 AM, Tom.O <tom.olsson at dnbnor.com> wrote:
>
> Hi R gurus
> I have a matching problem that I can't solve. I have tried multiple solutions
> and searched various help sites but I can't get it to work.
>
> This is the problem
> myexstrings = c("*AAA.AA","BBB BB","*.CCC.","**dd- d")
>
> What I want to do is remove any non-word characters at the beginning, and
> everything from the first non-word character after the first run of word
> characters onward, so that the strings become:
>
> c("AAA","BBB","CCC","dd")
>
>
> I can figure out the start: sub("^\\W*", "", myexstrings, perl=TRUE) will remove
> the unwanted beginnings, but then there is the rest.
>
> And please no links to any help pages, I have been looking at most of them
> for the last hour without any success.
>
> Thanks
> Regards
> Tom