# [R] regular expressions, sub

Philippe Grosjean phgrosjean at sciviews.org
Fri Jan 27 12:14:25 CET 2006

```
Hello,

Here is what I got after playing a little bit with your problem:

# First of all, if you prefer 'ln' to 'log', why not define:
ln <- function(x) log(x)
ln2 <- function(x) log(x)^2
ln3 <- function(x) log(x)^3
ln4 <- function(x) log(x)^4
# ... as many functions as powers you need

# Then, your formula is now closer to what you want
# which makes the whole code easier to read for you:

Form <- ln(D) ~ ln(N) + ln2(N) + ln(t) # Same as your original formula

# Here is the function to transform it into a more readable string:
formulaTransform <-
function(form, as.expression = FALSE) {
    if (!inherits(form, "formula"))
        stop("'form' must be a 'formula' object!")

    # Transform the formula into a string (is there a better way?)
    Res <- paste(as.character(form)[c(2, 1, 3)], collapse = " ")

    if (as.expression) { # Transform the formula into a nice expression
        # Change '~' into '%~~%' (plotmath's "approximately equal" sign)
        Res <- sub("~", "%~~%", Res) # How to do '~' in an expression?
        # Replace brackets with '~' (a space in plotmath)
        Res <- gsub("[(]([A-Za-z0-9._]*)[)]", " ~ \\1", Res)
        # Transform powers
        Res <- gsub("ln([2-9])", "ln^\\1", Res)
        Res <- eval(parse(text = Res))
    } else { # Make a nicer string
        # Replace brackets with a space
        Res <- gsub("[(]([A-Za-z0-9._]*)[)]", " \\1", Res)
        # Transform powers
        Res <- gsub("ln([2-9])", "ln^\\1", Res)
    }

    # Return the result
    return(Res)
}

# Here is a nicer presentation as a string
formulaTransform(Form)

# Here is an even nicer presentation (creating an expression)
plot(1:3, type = "n")
text(2, 2, formulaTransform(Form, TRUE))

# The latter form is really interesting when you use, for instance,
# Greek letters for variables, and so on...
Form2 <- ln(alpha) ~ ln(beta) + ln2(beta) + ln3(beta)

formulaTransform(Form2)
plot(1:3, type = "n")
text(2, 2, formulaTransform(Form2, TRUE))

# ... but this could be refined even more!

Best,

Philippe Grosjean

..............................................<°}))><........
) ) ) ) )
( ( ( ( (    Prof. Philippe Grosjean
) ) ) ) )
( ( ( ( (    Numerical Ecology of Aquatic Systems
) ) ) ) )   Mons-Hainaut University, Pentagone (3D08)
( ( ( ( (
..............................................................

Christian Hoffmann wrote:
> Hi,
>
> I am trying to use sub, regexpr on expressions like
>
>     log(D) ~ log(N)+I(log(N)^2)+log(t)
>
> being a model specification.
>
> The aim is to produce:
>
>     "ln D ~ ln N + ln^2 N + ln t"
>
> The variable names N, t may change, the number of terms too.
>
> I succeeded only partially; help on regular expressions is hard to
> understand for me, examples on my case are rare. The help page on R-help
> for grep etc. and "regular expressions"
>
> What I am doing:
>
> (f <- log(D) ~ log(N)+I(log(N)^2)+log(t))
> (ft <- sub("","",f))   # creates string with parts of formula,
>                        # how to do it simpler?
> (fu <- paste(ft[c(2,1,3)],collapse=" "))  # converts to one string
>
> Then I want to use \1 for backreferences something like
>
> (fv <- sub("log( [:alpha:] N  )^ [:alpha:)","ln \\1^\\2",fu))
>
> to change "log(g)^7" to "ln^7 g",
>
> and to eliminate I(): sub("I(blabla)","\\1",fv)  # I(xxx) -> xxx
>
> The special characters are making trouble, sub accepts "(", ")" only in
> pairs. Code for experimentation:
>
> trysub <- function(s,t,e) {
>   ii <- 0
>   for (i1 in c(TRUE,FALSE)) for (i2 in c(TRUE,FALSE))
>   for (i3 in c(TRUE,FALSE)) for (i4 in c(TRUE,FALSE))
>     print(paste(ii <- ii+1,
>                 ifelse(i1,"  "," ~"), "ext",
>                 ifelse(i2,"  "," ~"), "perl",
>                 ifelse(i3,"  "," ~"), "fixed ",
>                 ifelse(i4,"  "," ~"), "useBytes: ",
>                 try(sub(s,t,e, extended=i1, perl=i2, fixed=i3, useBytes=i4)),
>                 sep=""))
>   invisible(0)
> }
>
> trysub("I(log(N)^2)","ln n^2",fu) # A: desired result for cases
> 5,6,13..16, the rest unsubstituted
>
> trysub("log(","ln ",fu)           # B: no substitutions; errors for
> cases 1..4,7.. 12   # typical errors:
> "3  ext  perl ~fixed   useBytes: Error in sub.perl(pattern, replacement,
> x, ignore.case, useBytes) : \n\tinvalid regular expression 'log('\n"
>
> trysub("log\(","ln ",fu)          # C: same as A
>
> trysub("log\\(","ln ",fu)         # D: no substitutions; errors for
> cases 15,16        # typical errors:
> "15 ~ext ~perl ~fixed   useBytes: Error in sub(pattern, replacement, x,
> ignore.case, extended, fixed, useBytes) : \n\tinvalid regular expression
> 'log\\('\n"
>
> trysub("log\\(([:alpha:]+)\\)","ln \1",fu) # no substitutions, no errors
> # E: typical errors:
> "3  ext  perl ~fixed   useBytes: Error in sub.perl(pattern, replacement,
> x, ignore.case, useBytes) : \n\tinvalid regular expression
> 'log\\(([:alpha:]+)\\)'\n"
>
>
>
> Thanks for help
> Christian
>
> PS. The explanations in the documents

```
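The question in the quoted message (escaping parentheses and using backreferences to turn `log(D) ~ log(N)+I(log(N)^2)+log(t)` into `"ln D ~ ln N + ln^2 N + ln t"`) can also be approached directly with `gsub()`, without redefining `ln`, `ln2`, and so on. The following is only a minimal sketch, not the method from the reply: the helper name `formulaToLn` is hypothetical, and it assumes that powers are always written as `I(log(x)^k)` with a simple variable name `x` and an integer `k`.

```r
# Hypothetical helper (not from the thread): rewrite a model formula as a
# compact "ln" string using base R regular expressions only.
formulaToLn <- function(form) {
  stopifnot(inherits(form, "formula"))

  # deparse() returns the formula as text with normalized spacing;
  # paste() rejoins long formulas that deparse() splits into several strings.
  s <- paste(deparse(form), collapse = " ")

  # Step 1: I(log(x)^k) -> ln^k x.
  # Literal parentheses and '^' are escaped with a doubled backslash, and a
  # character class such as [:alnum:] must sit inside its own brackets.
  s <- gsub("I\\(log\\(([[:alnum:]._]+)\\)\\^([0-9]+)\\)", "ln^\\2 \\1", s)

  # Step 2: every remaining log(x) -> ln x.
  s <- gsub("log\\(([[:alnum:]._]+)\\)", "ln \\1", s)

  s
}

f <- log(D) ~ log(N) + I(log(N)^2) + log(t)
formulaToLn(f)

# Three equivalent ways to match a literal "(" in a pattern:
x <- "log(D)"
sub("log(", "ln ", x, fixed = TRUE)  # fixed = TRUE: no regex interpretation
sub("log\\(", "ln ", x)              # escaped parenthesis
sub("log[(]", "ln ", x)              # bracket expression, as in the reply
```

The order of the two substitutions matters in this sketch: the `I(...)` terms are rewritten first so that the plain `log(x)` rule does not consume their inner `log(N)` and leave a stray `I(...)` wrapper behind.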