[R] Split a string vector with '[ ]'
arun
smartpink111 at yahoo.com
Mon Jun 9 13:05:55 CEST 2014
Hi Alexsandro,
Suppose you have strings like these:
nw.str1 <- "[D][A|D]A:F[T|A:D]N[C|T]"
nw.str2 <- "[D][A|D]A[T|A:D][C|T]NA{DG]P"
you could use:
library(qdap)
as.vector(bracketXtract(nw.str1,"square",TRUE))
#[1] "[D]" "[A|D]" "[T|A:D]" "[C|T]"
as.vector(bracketXtract(nw.str2,"square",TRUE))
#[1] "[D]" "[A|D]" "[T|A:D]" "[C|T]"
#or
regmatches(nw.str1, gregexpr("(\\[).*?(\\])", nw.str1))[[1]]
#[1] "[D]" "[A|D]" "[T|A:D]" "[C|T]"
regmatches(nw.str2, gregexpr("(\\[).*?(\\])", nw.str2))[[1]]
#[1] "[D]" "[A|D]" "[T|A:D]" "[C|T]"
#or modifying David's and Duncan's code for the first case:
scan(what="",text=gsub("\\].*?\\[","] [", nw.str1))
#Read 4 items
#[1] "[D]" "[A|D]" "[T|A:D]" "[C|T]"
readLines(textConnection(gsub("\\].*?\\[", "]\n[", nw.str1)))
#[1] "[D]" "[A|D]" "[T|A:D]" "[C|T]"
## I couldn't get it right with gsub() for the second case: the trailing "NA{DG]P" stays attached to the last item.
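One possible gsub() route for the second case, as a sketch only: keep each bracketed group through a backreference and turn every other character into a space, then let scan() split on whitespace. The names `spaced` and `parts` are illustrative, and the approach assumes the bracketed groups contain no whitespace and no nested brackets.

```r
nw.str2 <- "[D][A|D]A[T|A:D][C|T]NA{DG]P"
# First alternative captures a complete "[...]" group as \\1;
# any other single character matches "." and leaves \\1 empty,
# so it is replaced by just a space.
spaced <- gsub("(\\[[^]]*\\])|.", "\\1 ", nw.str2)
parts <- scan(what = "", text = spaced, quiet = TRUE)
parts
#[1] "[D]" "[A|D]" "[T|A:D]" "[C|T]"
```

The unmatched "{DG]" is dropped because it never opens with "[", which matches what the regmatches() version returns.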
A.K.
On Sunday, June 8, 2014 4:57 PM, David Winsemius <dwinsemius at comcast.net> wrote:
On Jun 8, 2014, at 1:46 PM, Duncan Murdoch wrote:
> On 08/06/2014, 4:30 PM, Alexsandro Cândido de Oliveira Silva wrote:
>> Hi,
>>
>> I have a string like this:
>>
>> nw.str <- "[D][A|D][T|A:D][C|T]"
>>
>> And I need to split it in this way:
>>
>> "[D]" "[A|D]" "[T|A:D]" "[C|T]"
>
> You could probably use lookahead and lookbehind Perl regular
> expressions, but this might be easier:
>
> readLines(textConnection(gsub("\\]\\[", "]\n[", nw.str)))
>
> This just inserts a newline between each pair of brackets, and then
> reads the resulting string.
Same idea with scan() using space as separator:
scan(what="", text=gsub("\\]\\[", "\\] \\[", nw.str))
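The lookahead/lookbehind idea Duncan mentions could look roughly like this. It is a sketch using strsplit() with perl = TRUE, which neither poster wrote out, and `pieces` is an illustrative name. It assumes the groups sit directly next to each other, as in the original nw.str.

```r
nw.str <- "[D][A|D][T|A:D][C|T]"
# Split at the zero-width position that follows "]" (lookbehind)
# and precedes "[" (lookahead); lookarounds need perl = TRUE.
pieces <- strsplit(nw.str, "(?<=\\])(?=\\[)", perl = TRUE)[[1]]
pieces
#[1] "[D]" "[A|D]" "[T|A:D]" "[C|T]"
```

Because the split pattern consumes no characters, the brackets stay in the output without having to be re-inserted.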
--
David Winsemius
Alameda, CA, USA