[R] regular expressions, sub

Christian Hoffmann christian.hoffmann at wsl.ch
Fri Jan 27 10:54:11 CET 2006


Hi,

I am trying to use sub, regexpr on expressions like

    log(D) ~ log(N)+I(log(N)^2)+log(t)

being a model specification.

The aim is to produce:

    "ln D ~ ln N + ln^2 N + ln t"

The variable names N, t may change, the number of terms too.

I succeded only partially, help on regular expressions is hard to 
understand for me, examples on my case are rare. The help page on R-help 
for grep etc. and "regular expressions"

What I am doing:

(f <- log(D) ~ log(N)+I(log(N)^2)+log(t))
(ft <- sub("","",f))   # creates string with parts of formula, how to do 
it simpler?
(fu <- paste(ft[c(2,1,3)],collapse=" "))  # converts to one string

Then I want to use \1 for backreferences something like

(fv <- sub("log( [:alpha:] N  )^ [:alpha:)","ln \\1^\\2",fu))

to change "log(g)^7" to "ln^7 g",

and to eliminate I(): sub("I(blabla)","\\1",fv)  # I(xxx) -> xxx

The special characters are making trouble, sub acceps "(", ")" only in 
pairs. Code for experimentation:

trysub <- function(s,t,e) {
ii<-0; for (i1 in c(TRUE,FALSE)) for (i2 in c(TRUE,FALSE)) for (i3 in 
c(TRUE,FALSE)) for (i4 in c(TRUE,FALSE)) 
print(paste(ii<-ii+1,ifelse(i1,"  "," ~"),"ext",ifelse(i2,"  "," 
~"),"perl",ifelse(i3,"  "," ~"),"fixed ",ifelse(i4,"  "," ~"),"useBytes: 
", try(sub(s,t,e, extended=i1, perl=i2, fixed=i3, 
useBytes=i4)),sep=""));invisible(0) }

trysub("I(log(N)^2)","ln n^2",fu) # A: desired result for cases 
5,6,13..16, the rest unsubstituted

trysub("log(","ln ",fu)           # B: no substitutions; errors for 
cases 1..4,7.. 12   # typical errors:
"3  ext  perl ~fixed   useBytes: Error in sub.perl(pattern, replacement, 
x, ignore.case, useBytes) : \n\tinvalid regular expression 'log('\n"

trysub("log\(","ln ",fu)          # C: same as A

trysub("log\\(","ln ",fu)         # D: no substitutions; errors for 
cases 15,16        # typical errors:
"15 ~ext ~perl ~fixed   useBytes: Error in sub(pattern, replacement, x, 
ignore.case, extended, fixed, useBytes) : \n\tinvalid regular expression 
'log\\('\n"

trysub("log\\(([:alpha:]+)\\)","ln \1",fu) # no substitutions, no errors
# E: typical errors:
"3  ext  perl ~fixed   useBytes: Error in sub.perl(pattern, replacement, 
x, ignore.case, useBytes) : \n\tinvalid regular expression 
'log\\(([:alpha:]+)\\)'\n"



Thanks for help
Christian

PS. The explanations in the documents
-- 
Dr. Christian W. Hoffmann,
Swiss Federal Research Institute WSL
Mathematics + Statistical Computing
Zuercherstrasse 111
CH-8903 Birmensdorf, Switzerland

Tel +41-44-7392-277  (office)   -111(exchange)
Fax +41-44-7392-215  (fax)
christian.hoffmann at wsl.ch
http://www.wsl.ch/staff/christian.hoffmann

International Conference 5.-7.6.2006 Ekaterinburg Russia
"Climate changes and their impact on boreal and temperate forests"
http://ecoinf.uran.ru/conference/




More information about the R-help mailing list