[R] strsplit, keeping delimiters

hadley wickham h.wickham at gmail.com
Sat Jun 14 17:46:10 CEST 2008


On Sat, Jun 14, 2008 at 10:20 AM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
> "hadley wickham" <h.wickham at gmail.com> writes:
>
>> On Sat, Jun 14, 2008 at 12:55 AM, Gabor Grothendieck
>> <ggrothendieck at gmail.com> wrote:
>>> Try this:
>>>
>>>> library(gsubfn)
>>>> x <- "A: 123 B: 456 C: 678"
>>>> strapply(x, "[^ :]+[ :]|[^ :]+$")
>>> [[1]]
>>> [1] "A:"   "123 " "B:"   "456 " "C:"   "678"
>
> Also
>
>> strsplit(x, "(?<=[0-9:] )", perl=TRUE)
> [[1]]
> [1] "A: "  "123 " "B: "  "456 " "C: "  "678"
>
> which uses perl's zero-length lookbehind to match "" preceded by a
> digit or : and then a space. This is not quite what you asked for

My real example is actually a little more complicated

x <- "AC: 123 BDEF: 456 CADSDFSDFSF: 6sdf:78"

so the look-ahead approach doesn't work (and neither does a
look-behind, because a look-behind pattern has to be fixed length).
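
One possible way around the lookaround constraints is to match the
tokens themselves instead of splitting on the gaps between them. The
following is only a minimal sketch, not something proposed in this
thread: it assumes tokens are separated by single spaces, and it
assumes regmatches() is available in base R (it is in more recent
versions). The names tokens and stripped are illustrative.

```r
x <- "AC: 123 BDEF: 456 CADSDFSDFSF: 6sdf:78"

# Match each run of non-space characters plus at most one trailing
# space, so the delimiter stays attached to the token before it.
# No lookaround is needed, so field names can be of any length.
tokens <- regmatches(x, gregexpr("[^ ]+ ?", x))[[1]]
# tokens: "AC: ", "123 ", "BDEF: ", "456 ", "CADSDFSDFSF: ", "6sdf:78"

# Strip the trailing spaces afterwards.
stripped <- sub(" +$", "", tokens)
# stripped: "AC:", "123", "BDEF:", "456", "CADSDFSDFSF:", "6sdf:78"
```

Because the pattern only looks at spaces, a colon inside a value (as
in "6sdf:78") does not cause an extra split.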

>> I'd like to get
>
>> c("A:", "123 ", "B: ", "456 ", "C: ", 678)
>
> (no space after A:) or what Gabor offered (no spaces after :) but maybe
> what you intended?

Either way is fine, since I'll be stripping off the spaces later anyway.

Hadley

-- 
http://had.co.nz/


