[R] Splitting a character vector.

David Winsemius dwinsemius at comcast.net
Sun Jul 8 03:45:58 CEST 2012

On Jul 7, 2012, at 5:37 PM, John Kane wrote:

> I am lousy at simple regex and I have not found a solution to a
> simple problem.
> I have a vector with some character values that I want to split.
> Sample data
> dd1 <- c("XXY (mat harry)", "XXY (jim bob)", "CAMP (joe blow)",
>          "ALP (max jack)")
> Desired result
> dd2 <- data.frame(xx = c("XXY", "XXY", "CAMP", "ALP"),
>                   yy = c("mat harry", "jim bob", "joe blow", "max jack"))

data.frame(xx=sub("(\\s\\(.+$)", "", dd1),
            yy=sub("(.+)(\\s\\()(.+)(\\)$)", "\\3", dd1) )
    xx        yy
1  XXY mat harry
2  XXY   jim bob
3 CAMP  joe blow
4  ALP  max jack

> I thought I should be able to split the characters with strsplit, but
> either I am misunderstanding the function or I don't know how to
> escape a "(" properly in an effort to at least get "XXY" "(mat harry)".
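
On the strsplit question: a minimal sketch of that route, assuming the same dd1 as in the sample data. In a regular expression the "(" has to be written as "\\(" inside an R string; alternatively, fixed = TRUE makes strsplit treat the split string literally so no escaping is needed. The names parts, parts_fixed and dd2_split are only illustrative.

```r
dd1 <- c("XXY (mat harry)", "XXY (jim bob)", "CAMP (joe blow)",
         "ALP (max jack)")

# Split on a space followed by a literal "(".
# The "(" is escaped as "\\(" because strsplit uses a regex by default.
parts <- strsplit(dd1, " \\(")

# Same split without a regex: fixed = TRUE matches " (" literally.
parts_fixed <- strsplit(dd1, " (", fixed = TRUE)

# Each list element now holds two pieces, e.g. "XXY" and "mat harry)".
# Take the first piece as xx, and strip the trailing ")" from the second.
dd2_split <- data.frame(xx = sapply(parts, `[`, 1),
                        yy = sub("\\)$", "", sapply(parts, `[`, 2)))
```

This should give the same two columns as the sub() version above; which one reads better is largely a matter of taste.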

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
