[R] gsub: replacing a.*a if no occurrence of b in .*

Gabor Grothendieck ggrothendieck at gmail.com
Sat Feb 24 17:37:32 CET 2007


The _question_ assumed that, which is why the answers did too.

On 2/24/07, Charilaos Skiadas <skiadas at hanover.edu> wrote:
> All these methods do assume that you don't have nested <tag>'s, like so:
>
> <tag><tag>foo</tag>useful stuff</tag>some garbage</tag>
>
> For that you would really need a true parser. So I would double-check
> to make sure this doesn't happen.
>
> Do you have any control over where those XML files are generated
> though? It sounds to me it might be easier to fix the utility
> generating those XML files, since it clearly is doing something wrong.
>
> On Feb 24, 2007, at 11:07 AM, Gabor Grothendieck wrote:
>
> > I assume <tag> is known.
> >
> > This removes any occurrence of </tag>.*</tag> where .* does not
> > contain <tag> or </tag>.
> >
> > The regular expression, re, matches </tag>, then does a non-greedy
> > match (?U) for anything followed by </tag> but uses a
> > zero-width lookahead subexpression (?=...) for the second </tag>
> > so that it can be matched again.  gsubfn in package
> > gsubfn is like the usual gsub except that instead of
> > replacing the match with a string it passes the match
> > to function f and then replaces the match with the output
> > of f.  See the gsubfn home page:
> >   http://code.google.com/p/gsubfn/
> > and vignette.
>
> Haris Skiadas
> Department of Mathematics and Computer Science
> Hanover College
>
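
For anyone reading this in the archive, here is a minimal base-R sketch of the
lookahead idea described in the quoted text. It is not the gsubfn code from the
earlier message: it assumes perl = TRUE (PCRE) and swaps the callback for a
pattern that refuses to step over <tag> or </tag>. The sample string x is made
up for illustration, and like the other answers this assumes no nested <tag>'s.

```r
# Hypothetical input: "some garbage</tag>" is the stray part we want gone.
x <- "<tag>useful stuff</tag>some garbage</tag> more <tag>keep</tag>"

# </tag>                 the first closing tag (consumed by the match)
# (?:(?!</?tag>).)*      any run of characters that never starts <tag> or </tag>
# (?=</tag>)             zero-width lookahead: a second </tag> must follow,
#                        but it is not consumed, so it stays in the result
re <- "</tag>(?:(?!</?tag>).)*(?=</tag>)"

clean <- gsub(re, "", x, perl = TRUE)
print(clean)
# [1] "<tag>useful stuff</tag> more <tag>keep</tag>"
```

Because the second </tag> is only looked at, not consumed, it survives the
replacement and is available as the start of the next candidate match.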


