[R] the hat ^ in regular expression

Duncan Murdoch murdoch at stats.uwo.ca
Mon Feb 8 12:39:20 CET 2010


christophe dutang wrote:
> Dear UseRs,
>
>
> I'm trying to find variable names (the string after "mydata$") in an
> expression. For example,
>
> myexpr <- expression( ( mydata$variable1 / mydata$variable2 ) ^ 2 - 1 + 3 * 4 )
>
> I would like to get "variable1" and "variable2". The following few lines
> split the original character string into pieces.
>
> mystring <- as.character(myexpr)
>
> mydatapositions <- gregexpr("mydata", as.character(myexpr))[[1]]
>
> mydataend <- mydatapositions + attr(mydatapositions, "match.length") +1
>
> mydatabegin <- c(mydatapositions, nchar(mystring))
>
> In this loop, I try to remove operator signs, spaces and brackets. But I
> could not match the hat ^ in the string.
>
> for(i in 1:length(mydatapositions ))
> {
>
>     nomydata <- substr(mystring, mydataend[i], mydatabegin[i+1]-1)
>     nomydata <- gsub(" ","",nomydata)
>     print(nomydata)
>
>     cat("_____\n")
>     res <- gregexpr("[+]",nomydata)
>     print(c(gsub("[+]","",nomydata), unlist(res)))
>
>     res <- gregexpr("[-]",nomydata)
>     print(c(gsub("[-]","",nomydata), unlist(res)))
>
>     res <- gregexpr("[/]",nomydata)
>     print(c(gsub("[/]","",nomydata), unlist(res)))
>
>     res <- gregexpr("[*]",nomydata)
>     print(c(gsub("[*]","",nomydata), unlist(res)))
>
>     res <- gregexpr(")",nomydata)
>     print(c(gsub(")","",nomydata), unlist(res)))
>
>     res <- gregexpr("\^",nomydata)
>     print(c(gsub("\^","",nomydata), unlist(res)))
>
>     print(gsub("[0-9]","",nomydata))
>
>     cat("-------------\n")
> }
>
> I get the following warnings telling me the character is not recognized but
> I don't know how to solve the problem...
>
> Warning messages:
> 1: '\^' is an unrecognized escape in a character string
> 2: unrecognized escape removed from "\^"
> 3: '\^' is an unrecognized escape in a character string
> 4: unrecognized escape removed from "\^"
>
> Any help is welcome.
>   

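The regular expression needs a backslash before the hat, because an
unescaped ^ anchors the match to the start of the string.  To get a
backslash into the pattern, you double it when you enter the string in
R code, e.g.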

res <- gregexpr("\\^",nomydata)
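
To make the escaping concrete, here is a minimal sketch.  The sample
string is made up for illustration and stands in for one of the pieces
produced by the loop above; fixed = TRUE is shown as an alternative that
avoids the escaping altogether.

```r
# Hypothetical sample string, standing in for one piece of the split expression
nomydata <- ")^2-1+3*4"

# In R source, "\\^" is the two-character string \^ ; the regex engine
# then reads that as a literal hat rather than a start-of-string anchor
nchar("\\^")                      # 2
gregexpr("\\^", nomydata)[[1]]    # match at position 2
gsub("\\^", "", nomydata)         # ")2-1+3*4"

# With fixed = TRUE the pattern is taken as plain text, so no escape is needed
gsub("^", "", nomydata, fixed = TRUE)
```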



For your original problem, you might find it more reliable to look
through the original expression rather than its deparsed text.
mydata$variable1 parses to a call with three elements:  $, mydata,
variable1, so a recursive search through the expression is likely to be
more robust against strangely formatted code or extremely long
expressions, which may not deparse completely.
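
One way such a recursive search could look is sketched below.  The
function name find_vars and its data argument are hypothetical, chosen
only for this illustration; the sketch does not handle calls with empty
arguments such as x[, 1].

```r
# find_vars() is a hypothetical helper, not part of base R.
# It walks the expression tree and collects the names used as mydata$<name>.
find_vars <- function(e, data = "mydata") {
  if (is.call(e)) {
    # A call such as mydata$variable1 has three elements: `$`, mydata, variable1
    if (identical(e[[1]], as.name("$")) && identical(e[[2]], as.name(data))) {
      return(as.character(e[[3]]))
    }
  } else if (!is.expression(e)) {
    # Symbols and constants cannot contain a `$` call
    return(character(0))
  }
  # Recurse into every element of the call or expression vector
  as.character(unique(unlist(lapply(as.list(e), find_vars, data = data))))
}

myexpr <- expression((mydata$variable1 / mydata$variable2)^2 - 1 + 3 * 4)
find_vars(myexpr)   # "variable1" "variable2"
```

Because this works on the parsed structure, spacing, line breaks and
operators such as ^ never need to be matched as text.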

Duncan Murdoch


