[R] string split problem
Marc Schwartz
marc_schwartz at me.com
Fri Oct 23 21:39:50 CEST 2015
> On Oct 23, 2015, at 2:17 PM, Jun Shen <jun.shen.ut at gmail.com> wrote:
>
> Dear list,
>
> Say I have a vector that has two different types of string
>
> test <- c('aaa.bb.cc','aaa.dd')
>
> I want to extract the first part of the string (aaa) as a name and save the
> rest of the string as another name.
>
> I was thinking something like
>
> sub('(.*)\\.(.*)','\\1',test) but doesn't give me what I want.
>
>
> Appreciate any comments. Thanks.
>
> Jun
How about something like this, which presumes that the part before the first period consists only of letters:
> gsub("^([[:alpha:]]+)\\.(.*)$", "\\1|\\2", test)
[1] "aaa|bb.cc" "aaa|dd"
or, equivalently here, since the anchored pattern can match only once per string:
> sub("^([[:alpha:]]+)\\.(.*)$", "\\1|\\2", test)
[1] "aaa|bb.cc" "aaa|dd"
Both calls capture the two components, before and after the first '.', and put a "|" between them, so that the result can then be passed to strsplit():
> strsplit(gsub("^([[:alpha:]]+)\\.(.*)$", "\\1|\\2", test), split = "\\|")
[[1]]
[1] "aaa"   "bb.cc"

[[2]]
[1] "aaa" "dd"
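One caveat with the placeholder approach is that it assumes "|" never occurs in the strings themselves. A minimal sketch that splits at the first period directly, without a placeholder, using regexpr() and regmatches() from base R (the names first_dot, parts, first and rest are only illustrative):

```r
test <- c("aaa.bb.cc", "aaa.dd")

# regexpr() locates only the first match in each string;
# fixed = TRUE treats "." as a literal period, not a regex wildcard
first_dot <- regexpr(".", test, fixed = TRUE)

# invert = TRUE returns the text on either side of that match
parts <- regmatches(test, first_dot, invert = TRUE)

# collect the two components into separate character vectors
first <- vapply(parts, `[`, character(1), 1)
rest  <- vapply(parts, `[`, character(1), 2)
```

A string with no period at all would yield a one-element entry in parts, so rest would be NA for that string.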
See ?regex
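As for why the pattern in the original post did not give the desired result: '.*' is greedy, so the first group runs up to the last period rather than the first. A short sketch of the difference, along with two sub() calls that return each component directly (the variable names are only illustrative, and the pattern with [^.] is just one possible alternative):

```r
test <- c("aaa.bb.cc", "aaa.dd")

# greedy: the first (.*) extends to the LAST period in the string
greedy <- sub("(.*)\\.(.*)", "\\1", test)

# [^.]* cannot cross a period, so the first group stops at the FIRST one
first <- sub("^([^.]*)\\.(.*)$", "\\1", test)
rest  <- sub("^([^.]*)\\.(.*)$", "\\2", test)
```

Here greedy comes out as "aaa.bb" and "aaa", while first and rest hold the pieces before and after the first period.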
Regards,
Marc Schwartz