[R] regular expressions, sub

Philippe Grosjean phgrosjean at sciviews.org
Fri Jan 27 12:14:25 CET 2006


Hello,

Here is what I got after playing a little bit with your problem:

# First of all, if you prefer 'ln' to 'log', why not define:
ln <- function(x) log(x)
ln2 <- function(x) log(x)^2
ln3 <- function(x) log(x)^3
ln4 <- function(x) log(x)^4
# ... as many functions as you need powers

# Then, your formula is now closer to what you want
# which makes the whole code easier to read for you:

Form <- ln(D) ~ ln(N) + ln2(N) + ln(t) # Same as your original formula

# Here is the function to transform it into a more readable string:
formulaTransform <-
function(form, as.expression = FALSE) {
     if (!inherits(form, "formula"))
         stop("'form' must be a 'formula' object!")
	
     # Transform the formula into a string (is there a better way?)
     Res <- paste(as.character(form)[c(2, 1, 3)], collapse = " ")

     if (as.expression) { # Transform the formula into a nice expression
         # Change '~' into '%~~%' (plotmath "approximately equal");
         # plotmath's '%~%' would draw a plain tilde instead
         Res <- sub("~", "%~~%", Res)
         # Eliminate brackets
         Res <- gsub("[(]([A-Za-z0-9._]*)[)]", " ~ \\1", Res)
         # Transform powers
         Res <- gsub("ln([2-9])", "ln^\\1", Res)
         Res <- eval(parse(text = Res))
     } else { # Make a nicer string
         # Eliminate brackets
         Res <- gsub("[(]([A-Za-z0-9._]*)[)]", " \\1", Res)
         # Transform powers
         Res <- gsub("ln([2-9])", "ln^\\1", Res)
     }

     # Return the result
     return(Res)
}

# Here is a nicer presentation as a string
formulaTransform(Form)
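
To see what the two gsub() calls in the string branch do, here is a small trace on the string the function builds for 'Form'. The names s, s1 and s2 are only for illustration and do not appear in the function itself.

```r
# The string obtained from as.character(Form)[c(2, 1, 3)]
s <- "ln(D) ~ ln(N) + ln2(N) + ln(t)"

# Step 1: "[(]...[)]" matches a literal pair of parentheses around a
# variable name; the name is captured by the inner ( ) and put back as \\1
s1 <- gsub("[(]([A-Za-z0-9._]*)[)]", " \\1", s)
s1  # "ln D ~ ln N + ln2 N + ln t"

# Step 2: the digit following 'ln' is captured and moved behind a '^'
s2 <- gsub("ln([2-9])", "ln^\\1", s1)
s2  # "ln D ~ ln N + ln^2 N + ln t"
```

Putting '(' inside a bracket expression, as in "[(]", is one way to make it literal; "\\(" does the same.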

# Here is an even nicer presentation (creating an expression)
plot(1:3, type = "n")
text(2, 2, formulaTransform(Form, TRUE))

# The latter form is especially useful when you use, for instance,
# Greek letters for variable names
Form2 <- ln(alpha) ~ ln(beta) + ln2(beta) + ln3(beta)

formulaTransform(Form2)
plot(1:3, type = "n")
text(2, 2, formulaTransform(Form2, TRUE))
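
As a possible variation, the label can also be handed to text() as the unevaluated result of parse(), with plotmath's '%~%' operator drawing the tilde. This is only a sketch: the label string is typed by hand here rather than produced by formulaTransform(), and 'lab' is a name chosen for the example.

```r
# '%~%' draws a tilde in plotmath; the plain '~' are spacing operators,
# and 'alpha' / 'beta' are rendered as Greek letters
lab <- parse(text = "ln ~ alpha %~% ln ~ beta + ln^2 ~ beta + ln^3 ~ beta")

is.expression(lab)   # parse() returns an expression vector

plot(1:3, type = "n")
text(2, 2, lab)
```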

# ... but this could be refined even more!
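
One possible refinement, if you would rather keep your original formula with log() and I(): the sketch below rewrites it directly with two gsub() calls. The helper name formulaToLn is made up for this example, and it assumes that powers are always written as I(log(x)^k) and that variable names contain only letters, digits, '.' and '_'.

```r
formulaToLn <- function(form) {
    if (!inherits(form, "formula"))
        stop("'form' must be a 'formula' object!")

    # deparse() may split a long formula over several strings
    s <- paste(deparse(form), collapse = " ")
    s <- gsub("[[:space:]]+", " ", s)

    # A POSIX class such as [:alnum:] is only valid inside a bracket
    # expression, hence [[:alnum:]._]; literal '(' , ')' and '^' are
    # escaped with a double backslash
    var <- "([[:alnum:]._]+)"

    # I(log(x)^k)  ->  ln^k x
    s <- gsub(paste("I\\(log\\(", var, "\\)\\^([0-9]+)\\)", sep = ""),
              "ln^\\2 \\1", s)
    # log(x)  ->  ln x
    s <- gsub(paste("log\\(", var, "\\)", sep = ""), "ln \\1", s)
    s
}

formulaToLn(log(D) ~ log(N) + I(log(N)^2) + log(t))
# "ln D ~ ln N + ln^2 N + ln t"
```

The I(...) terms are rewritten first, so that the second substitution only sees the remaining plain log() calls.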

Best,

Philippe Grosjean

..............................................<°}))><........
  ) ) ) ) )
( ( ( ( (    Prof. Philippe Grosjean
  ) ) ) ) )
( ( ( ( (    Numerical Ecology of Aquatic Systems
  ) ) ) ) )   Mons-Hainaut University, Pentagone (3D08)
( ( ( ( (
..............................................................

Christian Hoffmann wrote:
> Hi,
> 
> I am trying to use sub, regexpr on expressions like
> 
>     log(D) ~ log(N)+I(log(N)^2)+log(t)
> 
> being a model specification.
> 
> The aim is to produce:
> 
>     "ln D ~ ln N + ln^2 N + ln t"
> 
> The variable names N, t may change, the number of terms too.
> 
> I succeeded only partially; help on regular expressions is hard to 
> understand for me, examples on my case are rare. The help page on R-help 
> for grep etc. and "regular expressions"
> 
> What I am doing:
> 
> (f <- log(D) ~ log(N)+I(log(N)^2)+log(t))
> (ft <- sub("","",f))   # creates string with parts of formula, how to do it simpler?
> (fu <- paste(ft[c(2,1,3)],collapse=" "))  # converts to one string
> 
> Then I want to use \1 for backreferences, something like
> 
> (fv <- sub("log( [:alpha:] N  )^ [:alpha:)","ln \\1^\\2",fu))
> 
> to change "log(g)^7" to "ln^7 g",
> 
> and to eliminate I(): sub("I(blabla)","\\1",fv)  # I(xxx) -> xxx
> 
> The special characters are causing trouble; sub accepts "(", ")" only in 
> pairs. Code for experimentation:
> 
> trysub <- function(s,t,e) {
>   ii <- 0
>   for (i1 in c(TRUE,FALSE)) for (i2 in c(TRUE,FALSE))
>   for (i3 in c(TRUE,FALSE)) for (i4 in c(TRUE,FALSE))
>     print(paste(ii <- ii+1,
>                 ifelse(i1,"  "," ~"), "ext",
>                 ifelse(i2,"  "," ~"), "perl",
>                 ifelse(i3,"  "," ~"), "fixed ",
>                 ifelse(i4,"  "," ~"), "useBytes: ",
>                 try(sub(s,t,e, extended=i1, perl=i2, fixed=i3, useBytes=i4)),
>                 sep=""))
>   invisible(0)
> }
> 
> trysub("I(log(N)^2)","ln n^2",fu) # A: desired result for cases 5,6,13..16, the rest unsubstituted
> 
> trysub("log(","ln ",fu)           # B: no substitutions; errors for cases 1..4,7..12
> # typical errors:
> "3  ext  perl ~fixed   useBytes: Error in sub.perl(pattern, replacement, x, ignore.case, useBytes) : \n\tinvalid regular expression 'log('\n"
> 
> trysub("log\(","ln ",fu)          # C: same as A
> 
> trysub("log\\(","ln ",fu)         # D: no substitutions; errors for cases 15,16
> # typical errors:
> "15 ~ext ~perl ~fixed   useBytes: Error in sub(pattern, replacement, x, ignore.case, extended, fixed, useBytes) : \n\tinvalid regular expression 'log\\('\n"
> 
> trysub("log\\(([:alpha:]+)\\)","ln \1",fu) # no substitutions, no errors
> # E: typical errors:
> "3  ext  perl ~fixed   useBytes: Error in sub.perl(pattern, replacement, x, ignore.case, useBytes) : \n\tinvalid regular expression 'log\\(([:alpha:]+)\\)'\n"
> 
> 
> 
> Thanks for help
> Christian
> 
> PS. The explanations in the documents



