[R] gsub/strsplit with multiple patterns/splits

Jeff Newmiller jdnewmil at dcn.davis.CA.us
Thu May 31 07:58:39 CEST 2012

There are many resources for learning regular expressions (e.g. http://gnosis.cx/publish/programming/regular_expressions.html). Once you understand the basics you will probably be able to refer to the ?regex help page for specific tools. After you have waded through a tutorial, the following explanation should make more sense.

The braces are extended regex syntax for a repetition of a pattern by some minimum to some maximum number of times. The pattern immediately precedes the repetition specification. In the first case of {0,1} the pattern being repeated is the comma, and in the second case it is any of the characters in the square brackets (only a period in this case). The period is a special "match any character" pattern when not part of a set of characters, which is why it is wrapped in square brackets here. A common shorthand for zero or one of something is a ? symbol (a + symbol means one or more).
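As a rough illustration of the above, here is a minimal sketch. The company names in x are made up for the example (they are not the data from this thread), and the last pattern is only one possible way to also drop the word "Energy", using | to offer alternative patterns.

```r
# Hypothetical company names, invented for illustration only
x <- c("Acme Energy, Inc.", "Acme Energy Inc", "Beta Inc.", "Gamma Corp")

# ",{0,1}" is zero or one comma; "[.]{0,1}" is zero or one literal period
braces <- gsub(",{0,1} Inc[.]{0,1}", "", x)

# The same pattern written with the ? shorthand for "zero or one"
shorthand <- gsub(",? Inc[.]?", "", x)

# One possible way to remove " Energy" as well: alternatives separated by |
both <- gsub(" Energy|,? Inc[.]?", "", x)

# A bare period matches any character; inside brackets it is a literal period
any_char <- gsub(".", "", "a.b")
literal  <- gsub("[.]", "", "a.b")

print(braces)
print(both)
```

Note that " Inc" is matched wherever it occurs, so a name such as "Acme Incorporated" would also be altered; anchoring the pattern with $ or a word boundary would be needed to guard against that.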

Also, please learn to provide quoting context for the majority of us who do not use Nabble.
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
Sent from my phone. Please excuse my brevity.

mdvaan <mathijsdevaan at gmail.com> wrote:

>Thanks! That works like a charm, but I am not sure if I fully
>understand the syntax. I looked at the gsub page but still couldn't
>figure it out. What does the pattern part (",{0,1} Inc[.]{0,1}") do?
>What do the 0 and 1 in the curly brackets refer to? Also, what if, for
>example, I would want to remove the word "Energy"?
>Thank you very much in advance.
>Sent from the R help mailing list archive at Nabble.com.
