[R] element wise pattern recognition and string substitution

Jun Shen jun.shen.ut at gmail.com
Mon Sep 5 18:56:32 CEST 2016


Thanks for the reply, Bert.

Your solution solves the example. My actual situation is more general: I
have dot-concatenated strings built from multiple variables, and the values
of those variables may themselves contain dots. The number of dots is not
consistent across the values of a variable. So I am thinking of defining a
vector of patterns alongside the vector of strings, and hopefully finding a
way to apply each pattern from the pattern vector to the corresponding value
of the string vector. The only way I can think of is a "for" loop, which can
be slow. This also happens inside a function I am writing. I just wonder if
there is a more efficient way. Thanks a lot.
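
One way to pair each pattern with its own string without writing an explicit
"for" loop is mapply(), which calls sub() once per (pattern, string) pair. The
following is only a minimal sketch built on the example from the original
question quoted below; the name `result` is illustrative, and it assumes the
pattern vector and the string vector have the same length.

```r
# Example data from the original question
pattern1 <- "([^.]*)\\.([^.]*\\.[^.]*)\\.(.*)"
pattern2 <- "([^.]*)\\.([^.]*)\\.(.*)"
patterns <- c(pattern1, pattern2)
strings  <- c("TX.WT.CUT.mean", "mg.tx.cv")

# mapply() walks both vectors in parallel, so the i-th pattern is
# applied only to the i-th string. USE.NAMES = FALSE keeps the result
# from being named after the patterns.
result <- mapply(function(p, s) sub(p, "\\2", s),
                 patterns, strings, USE.NAMES = FALSE)
result
# [1] "WT.CUT" "tx"
```

Note that mapply() still loops internally, so this is tidier than a "for"
loop rather than necessarily faster. If the stringr package is an option,
its str_replace() is vectorised over the pattern argument as well as the
string, which may be worth trying for long vectors.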

Jun

On Mon, Sep 5, 2016 at 1:41 AM, Bert Gunter <bgunter.4567 at gmail.com> wrote:

> Well, he did provide an example, and...
>
>
> > z <- c('TX.WT.CUT.mean','mg.tx.cv')
>
> > sub("^.+?\\.(.+)\\.[^.]+$","\\1",z)
> [1] "WT.CUT" "tx"
>
>
> ## seems to do what was requested.
>
> Jeff would have to amplify on his initial statement however: do you
> mean that separate patterns can always be combined via "|" ?  Or
> something deeper?
>
> Cheers,
> Bert
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Sun, Sep 4, 2016 at 9:30 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
> > Your opening assertion is false.
> >
> > Provide a reproducible example and someone will demonstrate.
> > --
> > Sent from my phone. Please excuse my brevity.
> >
> > On September 4, 2016 9:06:59 PM PDT, Jun Shen <jun.shen.ut at gmail.com> wrote:
> >>Dear list,
> >>
> >>I have a vector of strings that cannot be described by one pattern. So
> >>let's say I construct a vector of patterns of the same length as the
> >>vector of strings. Can I do element-wise pattern recognition and string
> >>substitution?
> >>
> >>For example,
> >>
> >>pattern1 <- "([^.]*)\\.([^.]*\\.[^.]*)\\.(.*)"
> >>pattern2 <- "([^.]*)\\.([^.]*)\\.(.*)"
> >>
> >>patterns <- c(pattern1,pattern2)
> >>strings <- c('TX.WT.CUT.mean','mg.tx.cv')
> >>
> >>Say I want to extract "WT.CUT" from the first string and "tx" from the
> >>second string. If I do
> >>
> >>sub(patterns, '\\2', strings), only the first pattern will be used.
> >>
> >>Looping over the patterns doesn't work the way I want. I would
> >>appreciate any comments.
> >>Thanks.
> >>
> >>Jun
> >>
> >>       [[alternative HTML version deleted]]
> >>
> >>______________________________________________
> >>R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>https://stat.ethz.ch/mailman/listinfo/r-help
> >>PLEASE do read the posting guide
> >>http://www.R-project.org/posting-guide.html
> >>and provide commented, minimal, self-contained, reproducible code.
> >
>
