[R] Splitting a character vector.

John Kane jrkrideau at inbox.com
Mon Jul 9 15:32:45 CEST 2012


Right, I see it now. Thanks.  

Who knows, in another 100 years I may understand regex.

John Kane
Kingston ON Canada


> -----Original Message-----
> From: smartpink111 at yahoo.com
> Sent: Sat, 7 Jul 2012 16:19:54 -0700 (PDT)
> To: jrkrideau at inbox.com
> Subject: Re: [R] Splitting a character vector.
> 
> 
> 
> Hi John,
> If I understand the post, in your original data, there is a space between
> XXY and (.
> If there was no space,
> 
> dd1 <- c("XXY(mat harry)", "XXY(jim bob)", "CAMP(joe blow)",
>          "ALP(max jack)")
> 
> #Rui's original code will produce
> result<-strsplit(sub(close.par,"",dd1),open.par)
>> result
> [[1]]
> [1] "XXY(mat harry"
> 
> [[2]]
> [1] "XXY(jim bob"
> 
> [[3]]
> [1] "CAMP(joe blow"
> 
> [[4]]
> [1] "ALP(max jack"
> 
> 
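> # But if I wanted to get the same result as strsplit gives on the original data
> # (assumption: the output below only arises when the pattern has no leading blank)
> open.par <- "\\("
> result<-strsplit(sub(close.par," ",dd1),open.par)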
>> result
> [[1]]
> [1] "XXY"        "mat harry "
> 
> [[2]]
> [1] "XXY"      "jim bob "
> 
> [[3]]
> [1] "CAMP"      "joe blow "
> 
> [[4]]
> [1] "ALP"       "max jack "
> 
> 
> A.K.
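
If the data might mix both forms (with and without a blank before the parenthesis), one possible approach, not proposed in the thread itself, is to make the blank optional in the pattern. The sketch below is only an illustration: the vector dd_mixed is hypothetical, and open.par and close.par are redefined here rather than taken from Rui's code.

```r
# Hypothetical input mixing "XXY (..." and "XXY(..." forms.
dd_mixed <- c("XXY (mat harry)", "XXY(jim bob)", "CAMP (joe blow)", "ALP(max jack)")

open.par  <- " *\\("   # zero or more blanks followed by a literal "("
close.par <- "\\)$"    # a literal ")" at the end of the string

# Drop the closing parenthesis first, then split on the opening one.
result <- strsplit(sub(close.par, "", dd_mixed), open.par)
print(result)
```

Because the blanks are part of the matched pattern, they are discarded along with the parenthesis, so neither piece carries a stray space.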
> 
> 
> 
> 
> 
> 
> ----- Original Message -----
> From: John Kane <jrkrideau at inbox.com>
> To: Rui Barradas <ruipbarradas at sapo.pt>
> Cc: r-help at r-project.org
> Sent: Saturday, July 7, 2012 6:33 PM
> Subject: Re: [R] Splitting a character vector.
> 
> I think I'm getting it a bit. Anyway, time to shut down and have a beer.
> Life will be much nicer tomorrow or Monday when I get back to cleaning up
> the data from that spreadsheet.
> 
> Many thanks and have a good weekend.
> 
> John Kane
> Kingston ON Canada
> 
> 
>> -----Original Message-----
>> From: ruipbarradas at sapo.pt
>> Sent: Sat, 07 Jul 2012 23:28:26 +0100
>> To: jrkrideau at inbox.com
>> Subject: Re: [R] Splitting a character vector.
>> 
>> The space is there for a different reason: strsplit doesn't put the split
>> pattern in the result, so if a space is included in the pattern it is
>> deleted automatically. For instance, splitting "XXY (mat harry)" without
>> the space in the pattern would give "XXY " and "mat harry)", but we want
>> "XXY", so include the space in the pattern.
>> 
>> Another example, this one artificial:
>> 
>> "123AB456" ---> "123" and "456"
>> 
>> strsplit("123AB456", "B") ---> "123A" and "456"
>> 
>> So include the "A" in the pattern. It's _exactly_ the same thing.
>> 
>> Rui Barradas
>> 
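
To make the point about the dropped split pattern concrete, here is a minimal sketch that uses only base R's strsplit and the strings quoted above. The variable names (x, y, split_on_b and so on) are made up for illustration and do not come from the thread.

```r
# Whatever the pattern matches is removed from the pieces.
x <- "123AB456"
split_on_b  <- strsplit(x, "B")[[1]]    # "A" is not in the pattern, so it stays
split_on_ab <- strsplit(x, "AB")[[1]]   # "AB" is matched and removed
print(split_on_b)
print(split_on_ab)

# Same idea with the thread's data: the blank belongs in the pattern.
y <- "XXY (mat harry)"
without_blank <- strsplit(y, "\\(")[[1]]    # first piece keeps a trailing blank
with_blank    <- strsplit(y, " \\(")[[1]]   # blank goes away with the parenthesis
print(without_blank)
print(with_blank)
```
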
>> Em 07-07-2012 23:21, John Kane escreveu:
>>> Ah, I think Mark may have it.  See my earlier post.  Why the space?
>>> 
>>> John Kane
>>> Kingston ON Canada
>>> 
>>> 
>>>> -----Original Message-----
>>>> From: ruipbarradas at sapo.pt
>>>> Sent: Sat, 07 Jul 2012 23:12:46 +0100
>>>> To: markleeds2 at gmail.com
>>>> Subject: Re: [R] Splitting a character vector.
>>>> 
>>>> Oh, right!
>>>> 
>>>> The close parenthesis isn't doing anything in the result. It could be
>>>> removed afterwards, but since we're at it...
>>>> 
>>>> Rui Barradas
>>>> 
>>>> Em 07-07-2012 23:10, Mark Leeds escreveu:
>>>>> Hi Rui: I think he's asking about your replacement with blanks.
>>>>> 
>>>>> 
>>>>> On Sat, Jul 7, 2012 at 6:08 PM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
>>>>> 
> >>>>      Hello,
>>>>> 
> >>>>      Sorry, but I don't understand. You're asking about 4 single
> >>>>      quotes; the double quotes in open.par are just opening and
> >>>>      closing the pattern, a character string.
>>>>> 
> >>>>      Rui Barradas
>>>>> 
> >>>>      Em 07-07-2012 23:03, John Kane escreveu:
>>>>> 
> >>>>          Thanks Rui
> >>>>          It works perfectly so far on the test and real data.
>>>>> 
> >>>>          The annoying thing is that I had tried, or thought I'd tried,
> >>>>          the open.par format and kept getting an error.
>>>>> 
> >>>>          It looks like I had failed to add the '''', in the term.
> >>>>          What is it doing?
>>>>> 
>>>>> 
>>>>> 
> >>>>          John Kane
> >>>>          Kingston ON Canada
>>>>> 
>>>>> 
> >>>>              -----Original Message-----
> >>>>              From: ruipbarradas at sapo.pt
> >>>>              Sent: Sat, 07 Jul 2012 22:55:41 +0100
> >>>>              To: jrkrideau at inbox.com
> >>>>              Subject: Re: [R] Splitting a character vector.
>>>>> 
> >>>>              Hello,
>>>>> 
> >>>>              Try the following.
>>>>> 
> >>>>              open.par <- " \\("  # with a blank before '('
> >>>>              close.par <- "\\)"
> >>>>              result <- strsplit(sub(close.par, "", dd1), open.par)
>>>>> 
>>>>> 
> >>>>              Why the two '\\'? Because '(' is a meta-character, so it
> >>>>              must be escaped with '\'. But '\' is itself an escape
> >>>>              character in an R string, so it must also be escaped.
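
As a rough illustration of the escaping described above, the sketch below shows that the R literal "\\(" holds only two characters, and compares it with two alternatives that avoid backslashes altogether. The alternatives (a character class and fixed = TRUE) are not from the thread; they are standard base R options shown here for comparison, and the variable names are made up.

```r
pattern <- "\\("            # typed with two backslashes in R source
print(nchar(pattern))       # the string itself holds a backslash and "("
cat(pattern, "\n")          # shows the string as the regex engine receives it

x <- "XXY (mat harry)"
via_regex <- strsplit(x, " \\(")[[1]]              # escaped meta-character
via_class <- strsplit(x, " [(]")[[1]]              # "(" is literal inside [ ]
via_fixed <- strsplit(x, " (", fixed = TRUE)[[1]]  # no regex interpretation at all
print(via_regex)
```

All three calls split at the same place; fixed = TRUE is often the simplest choice when the delimiter is a plain string.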
>>>>> 
> >>>>              Then choose the right way to separate the two, maybe
> >>>>              something like
>>>>> 
> >>>>              ix <- rep(c(TRUE, FALSE), length(result))
> >>>>              unlist(result)[ix]
> >>>>              unlist(result)[!ix]
>>>>> 
>>>>> 
> >>>>              Hope this helps,
>>>>> 
> >>>>              Rui Barradas
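
For completeness, one possible way to go from the strsplit result to the data frame John asked for is sketched below. It reuses dd1, open.par and close.par as given in the thread; the vapply-based extraction and the length check are an assumption about how one might finish the job, not Rui's code, and they only work when every element splits into exactly two pieces.

```r
dd1 <- c("XXY (mat harry)", "XXY (jim bob)", "CAMP (joe blow)", "ALP (max jack)")

open.par  <- " \\("   # blank followed by a literal "("
close.par <- "\\)"    # literal ")"
result <- strsplit(sub(close.par, "", dd1), open.par)

# Guard: the extraction below assumes exactly two pieces per element.
stopifnot(all(lengths(result) == 2))

dd2 <- data.frame(
  xx = vapply(result, `[`, character(1), 1),   # first piece of each element
  yy = vapply(result, `[`, character(1), 2),   # second piece of each element
  stringsAsFactors = FALSE
)
print(dd2)
```
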
>>>>> 
> >>>>              Em 07-07-2012 22:37, John Kane escreveu:
>>>>> 
> >>>>                  I am lousy at simple regex and I have not found a
> >>>>                  solution to a simple problem.
>>>>> 
> >>>>                  I have a vector with some character values that I
> >>>>                  want to split.
> >>>>                  Sample data
> >>>>                  dd1 <- c("XXY (mat harry)", "XXY (jim bob)",
> >>>>                           "CAMP (joe blow)", "ALP (max jack)")
>>>>> 
> >>>>                  Desired result
> >>>>                  dd2 <- data.frame(xx = c("XXY", "XXY", "CAMP", "ALP"),
> >>>>                                    yy = c("mat harry", "jim bob",
> >>>>                                           "joe blow", "max jack"))
>>>>> 
> >>>>                  I thought I should be able to split the characters
> >>>>                  with strsplit, but either I am misunderstanding the
> >>>>                  function or don't know how to escape a "(" properly
> >>>>                  in an effort to at least get "XXY" "(mat harry)".
>>>>> 
> >>>>                  Any pointers would be appreciated
> >>>>                  Thanks
> >>>>                  John Kane
> >>>>                  Kingston ON Canada
>>>>> 
>>>>> 
>>>>> 
> >>>>                  ________________________________________________
> >>>>                  R-help at r-project.org mailing list
> >>>>                  https://stat.ethz.ch/mailman/listinfo/r-help
> >>>>                  PLEASE do read the posting guide
> >>>>                  http://www.R-project.org/posting-guide.html
> >>>>                  and provide commented, minimal, self-contained,
> >>>>                  reproducible code.
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>>> 
>> 
> 
> 
>
