[R] splitting a string into words preserving blanks (using regex)
Gabor Grothendieck
ggrothendieck at gmail.com
Mon Oct 24 16:07:02 CEST 2011
On Mon, Oct 24, 2011 at 9:46 AM, Mark Heckmann <mark.heckmann at gmx.de> wrote:
> I would like to split a string into words at its blanks but also to preserve all blanks.
>
> Example:
> c(" some words to split ")
> should become
> c(" ", "some", " ", "words", " ", "to", " ", "split", " ")
>
> I was not able to achieve this via strsplit().
> But I am not familiar with regular expressions.
> Is there an easy way to do that using e.g. regex and strsplit?
Try this:
> library(gsubfn)
> x <- " some words to split "
> v <- strapply(x, "(\\s*)(\\S+)(\\s*)", c)[[1]]
> v[nchar(v) > 0]
[1] " " "some" " " "words" " " "to" " " "split" " "
If you don't need the trailing space it can be further simplified:
> strapply(x, "(\\s*)(\\S+)", c)[[1]]
[1] " " "some" " " "words" " " "to" " " "split"
or if you don't need the leading space it can be simplified like this:
> strapply(x, "(\\S+)(\\s*)", c)[[1]]
[1] "some" " " "words" " " "to" " " "split" " "
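For comparison, the same split can probably be done without gsubfn. The following is only a sketch using base R's gregexpr() and regmatches(), not something taken from the reply above; the pattern "\\s+|\\S+" and the variable names m and tokens are assumptions chosen for illustration. It extracts every run of whitespace and every run of non-whitespace in order, so leading and trailing blanks are kept:

```r
# Base-R sketch (an alternative to strapply, assumed rather than
# taken from the original reply).
x <- " some words to split "

# Match either a run of whitespace or a run of non-whitespace.
m <- gregexpr("\\s+|\\S+", x, perl = TRUE)

# regmatches() returns a list with one character vector per input string.
tokens <- regmatches(x, m)[[1]]
print(tokens)
```

Because the two alternatives cover every character, pasting the pieces back together with paste(tokens, collapse = "") should reproduce the input string.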
--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com