[R] gsub() on Matrix

Tony Plate tplate at acm.org
Thu Oct 28 17:55:25 CEST 2004

Most of the more recent regular expression implementations have a way of 
indicating a match on a word boundary, usually "\b".

Here's what you did:

 > gsub("x1", "i1", "x1 + x2 + x10 + xx1")
[1] "i1 + x2 + i10 + xi1"

The following worked for me to just change "x1" to "i1", while leaving 
alone any larger "word" that contains "x1":

 > gsub("\\bx1\\b", "i1", "x1 + x2 + x10 + xx1")
[1] "i1 + x2 + x10 + xx1"

Note that the backslash must be escaped itself to get past the R lexical 
analyser, which is independent of the regexp processor.  What the regexp 
processor sees is just a single backslash.
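
To make the escaping point concrete, here is a small sketch (the variable names are just for illustration, not from the original question). It checks how many characters the pattern string really contains, as opposed to how many appear in the source text:

```r
# The source text "\\bx1\\b" denotes a 6-character string: \ b x 1 \ b
pattern <- "\\bx1\\b"
n_chars <- nchar(pattern)   # counts characters in the string, not in the source text

# cat() prints the string as the regexp processor receives it
cat(pattern, "\n")

# A single backslash in the source means something else entirely:
# "\b" is one character, the backspace
n_single <- nchar("\b")

# The doubled form is the one that gives the word-boundary match
result <- gsub(pattern, "i1", "x1 + x2 + x10 + xx1")
```

So the doubling is purely a matter of getting the backslash into the string; it is not part of the regular expression syntax.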

For more on this, see the Perl documentation on regular expressions.  Be 
aware that to use full Perl regexps, you must supply the perl=T argument to 
gsub().  Also note that "\b" seems to be part of the most basic regular 
expression language in R; it even works with extended=F:

 > gsub("\\bx1\\b", "i1", "x1 + x2 + x10 + xx1", perl=T)
[1] "i1 + x2 + x10 + xx1"
 > gsub("\\bx1\\b", "i1", "x1 + x2 + x10 + xx1", perl=F)
[1] "i1 + x2 + x10 + xx1"
 > gsub("\\bx1\\b", "i1", "x1 + x2 + x10 + xx1", perl=F, ext=F)
[1] "i1 + x2 + x10 + xx1"

(I assumed the fact that you have a matrix of strings is not relevant.)
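
If the matrix does matter, one way to apply all of the replacements is to loop over the pairs and wrap each original name in "\\b". This is only a sketch under some assumptions: orig is rebuilt with paste0(), mat is a one-column matrix made from the three strings you showed, and "out" is a name I made up. It also relies on none of the replacement names being one of the original names, since the substitutions are applied one after another.

```r
orig <- paste0("x", 1:14)
repl <- c("i7", "i14", "i13", "d2", "i8", "i5",
          "i6", "i3", "A", "i9", "i2",
          "i4", "i15", "i21")

# Assumed stand-in for the real matrix: the three strings from the question
mat <- matrix(c("x1 + x3 + x4 + x5 + x1:x3 + x1:x4",
                "x1 + x2 + x3 + x5 + x1:x2 + x1:x5",
                "x1 + x3 + x4 + x5 + x1:x3 + x1:x5"), ncol = 1)

out <- mat
for (k in seq_along(orig)) {
  # \\b on both sides stops "x1" from matching inside "x10" .. "x14"
  pattern <- paste0("\\b", orig[k], "\\b")
  # assigning into out[] keeps the matrix dimensions whatever gsub() returns
  out[] <- gsub(pattern, repl[k], out)
}
out
```

gsub() takes only a single pattern, which is why a loop (or something equivalent) over orig and repl is needed here.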

Hope this helps,

Tony Plate

At Wednesday 09:07 PM 10/27/2004, Kevin Wang wrote:
>Suppose I've got a matrix, and the first few elements look like
>   "x1 + x3 + x4 + x5 + x1:x3 + x1:x4"
>   "x1 + x2 + x3 + x5 + x1:x2 + x1:x5"
>   "x1 + x3 + x4 + x5 + x1:x3 + x1:x5"
>and so on (have got terms from x1 ~ x14).
>I want to replace all the x1 with i7, all x2 with i14, all x3 with i13,
>for example.  Is there an easy way?
>I tried to put what I want to replace in a vector, like:
>  repl = c("i7", "i14", "i13", "d2", "i8", "i5",
>           "i6", "i3", "A", "i9", "i2",
>           "i4", "i15", "i21")
>and have another vector, say:
>   > orig
>  [1] "x1"  "x2"  "x3"  "x4"  "x5"  "x6"  "x7"  "x8"  "x9"  "x10"
>[11] "x11" "x12" "x13" "x14"
>Then I tried something like
>   gsub(orig, repl, mat)
>## mat is the name of my matrix
>but it didn't work: it would replace terms like x10 with i70.
>(I know it may be an easy question...but I haven't done much regular
>Ko-Kang Kevin Wang
>PhD Student
>Centre for Mathematics and its Applications
>Building 27, Room 1004
>Mathematical Sciences Institute (MSI)
>Australian National University
>Canberra, ACT 0200
>Homepage: http://wwwmaths.anu.edu.au/~wangk/
>Ph (W): +61-2-6125-2431
>Ph (H): +61-2-6125-7407
>Ph (M): +61-40-451-8301