[R] regex question

Peter Dalgaard p.dalgaard at biostat.ku.dk
Tue Nov 4 09:28:21 CET 2008


markleeds at verizon.net wrote:
> Hi: Gabor's solution does do it in a single line. He just used paste to
> make the line; see below. John's is sort of a single line also, but he
> called sub twice.
> I doubt that it's possible to make it shorter than those solutions.

Well, you can lose the parentheses and one space character, that'll be 
shorter ;-) :

 > rr <- "^[ <*]+|[ >]+$"
 > gsub(rr,"",x)
[1] "this is my text"

There are also solutions involving replacement patterns, but they become 
a bit painful:

 > r <- "^[ <*]+(.*[[:alpha:]])[ > ]+$"
 > sub(r,"\\1",x)
[1] "this is my text"

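The trouble is that you want the subexpressions at either end to be
"greedy", but not the middle one: a bare (.*) would swallow all but the
last of the trailing blanks and ">" characters, which is why the group
above is forced to end on a letter with [[:alpha:]].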
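
To illustrate the greediness issue, here is a minimal sketch. The input
string is made up for the example (the original x is not shown in this
message), and the perl = TRUE variant with a lazy quantifier is one
possible alternative, not something proposed earlier in the thread:

```r
# Hypothetical input; stands in for the x used in the thread.
x <- "<* this is my text > "

# Greedy middle group: (.*) takes as much as it can, so only a single
# trailing character is left for [ >]+ to match.
greedy <- sub("^[ <*]+(.*)[ >]+$", "\\1", x)

# Forcing the group to end on a letter keeps the trailing junk out of it.
anchored <- sub("^[ <*]+(.*[[:alpha:]])[ >]+$", "\\1", x)

# Assumption: with perl = TRUE the lazy quantifier .*? can be used, so
# the middle group stops as early as the rest of the pattern allows.
lazy <- sub("^[ <*]+(.*?)[ >]+$", "\\1", x, perl = TRUE)

print(c(greedy = greedy, anchored = anchored, lazy = lazy))
```

The anchored form only works when the wanted text ends in a letter; the
lazy form does not need that assumption, at the cost of switching regex
engines.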

> # Gabor's solution spelled  out.
> 
> patReg1 <- "(^[ <*]+)"
> patReg2 <- "([ > ]+$)"
> temp <- paste(patReg1, patReg2, sep = "|")
> print(temp)
> 
> gsub(temp, "", varReg)
> 

-- 
    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907


