[R] Regular expression \ String Extraction help

Tony Breyal tony.breyal at googlemail.com
Fri Jun 5 12:34:12 CEST 2009


Thanks guys, that information is very much appreciated. Both
solutions are better than mine, which was to use capture.output after
using the cat function.
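
A minimal sketch of that capture.output approach, assuming step2 holds
the space-separated pieces of the file name (the exact code is not shown
in this thread, and step2 is typed in by hand here so the snippet runs
on its own):

```r
# Pieces of the file name after dropping the extension and splitting on spaces
# (entered by hand here; an assumption for illustration).
step2 <- c('2009-04-10', '1400', 'Fri',
           'Hitchhikers', 'Guide', 'To', 'The', 'Galaxy42')

# cat() prints the elements separated by spaces but returns NULL,
# so capture.output() is used to grab what it printed as a string.
name1 <- capture.output(cat(step2[4:length(step2)]))
print(name1)
```

It works, but it goes through printed output just to join a vector,
which is why the other suggestions are cleaner.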

re: collapse argument in paste() -- I've always wondered what that
argument was for. I thought it was basically doing what the sep
argument does, but I can now see that it does something very useful
that sep does not: it joins the elements of a single vector into one
string.

> txt <- c('Doctor', 'Who', 'For', 'The', 'Win!')
> paste(txt, sep=' ')
[1] "Doctor" "Who"    "For"    "The"    "Win!"
> paste(txt, collapse=' ')
[1] "Doctor Who For The Win!"
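
Applied to the original file-name problem, collapse removes the need for
the loop. A minimal sketch, assuming the first three space-separated
fields are always date, time and day (the names stamp, name1 and result
are just for illustration):

```r
txt <- '2009-04-10 1400 Fri Hitchhikers Guide To The Galaxy42.txt'
step1 <- unlist(strsplit(txt, '\\.'))      # split off the extension
step2 <- unlist(strsplit(step1[1], ' '))   # split the rest on spaces

# Fields 1-3 are date, time and day; everything after that is the name.
stamp <- paste(step2[1:3], collapse = ' ')
name1 <- paste(step2[-(1:3)], collapse = ' ')

# Strip the trailing integer from the name, as in the original code.
name <- gsub('[0-9]', '', name1)

result <- paste('C:\\test\\', name, '\\', stamp, '\\', txt, sep = '')
print(result)
```

Dropping the first three fields with step2[-(1:3)] works whether the
name has one word or several.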

re: regular expression by Gabor - mate, that line of code is a thing
of beauty. I can honestly say I would never have come up with that.
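
Gabor's line is not quoted in this message, so the following is not his
code. It is only a rough sketch of a single-pattern approach, assuming
every file name looks like 'date time day name digits.extension'; the
pattern and the variable names are my own assumptions:

```r
txt <- c('2009-04-10 1400 Fri Foo1.txt',
         '2009-04-10 1400 Fri Universities2.txt',
         '2009-04-10 1400 Fri Hitchhikers Guide To The Galaxy42.txt')

# Group 1: the first three space-separated fields (date, time, day).
# Group 2: the name, which may contain spaces and must end in a non-digit.
# After the name come the trailing integer, a dot, and the extension.
pattern <- '^([^ ]+ [^ ]+ [^ ]+) (.*[^0-9])[0-9]+\\.[^.]+$'

stamp <- sub(pattern, '\\1', txt)
name  <- sub(pattern, '\\2', txt)

# sep is a single backslash; print() shows it doubled.
paths <- paste('C:\\test', name, stamp, txt, sep = '\\')
print(paths)
```

Because sub() and paste() are vectorised, this handles all the file
names in one go. Unlike the gsub('[0-9]', ...) step, it only drops the
digits at the end of the name.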

Cheers,
Tony

2009/6/3 Greg Snow <Greg.Snow at imail.org>:
> Ted,
>
> Try using paste with the collapse argument.
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.snow at imail.org
> 801.408.8111
>
>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
>> On Behalf Of Ted Harding
>> Sent: Wednesday, June 03, 2009 7:56 AM
>> To: Tony Breyal; r-help at r-project.org
>> Subject: Re: [R] Regular expression \ String Extraction help
>>
>> On 03-Jun-09 11:34:16, Tony Breyal wrote:
>> > Dear all,
>> > Is there a good way of doing the following conversion:
>> >
>> > [YYYY]-[MM]-[DD] [Time] [Day] [Name][Integer].[Extension]
>> >
>> > to become
>> >
>> > C:\test\[Name]\[YYYY]-[MM]-[DD] [Time] [Day]\[YYYY]-[MM]-[DD] [Time]
>> > [Day] [Name][Integer].[Extension]
>> >
>> > i.e. these
>> >
>> > 2009-04-10 1400 Fri Foo1.txt
>> > 2009-04-10 1400 Fri Universities2.txt
>> > 2009-04-10 1400 Fri Hitchhikers Guide To The Galaxy42.txt
>> >
>> > will become
>> >
>> > C:\test\Foo\2009-04-10 1400 Fri\2009-04-10 1400 Fri Foo1.txt
>> > C:\test\Universities\2009-04-10 1400 Fri\2009-04-10 1400 Fri
>> > Universities2.txt
>> > C:\test\Hitchhikers Guide To The Galaxy\2009-04-10 1400 Fri\2009-04-10
>> > 1400 Fri Hitchhikers Guide To The Galaxy42.txt
>> >
>> > My main issue is the conversion for 'Hitchhikers Guide To The
>> > Galaxy42' because of the spaces in the Name. So far this is what I
>> > have:
>> >
>> >> txt <- '2009-04-10 1400 Fri Universities1.txt'
>> >> step1 <- unlist(strsplit(txt, '\\.'))
>> >> step2 <- unlist(strsplit(step1[1], ' '))
>> >> Name <- gsub('[0-9]',replacement='', step2[4])
>> >> step3 <- paste(step2[1], step2[2], step2[3], sep=' ')
>> >> paste('C:\\test\\', Name, '\\', step3, '\\', txt, sep='' )
>> > [1] "C:\\test\\Universities\\2009-04-10 1400 Fri\\2009-04-10 1400 Fri
>> > Universities1.txt"
>> >
>> > Cheers,
>> > Tony Breyal
>>
>> I can get as far as the following (using the "Hitchhikers" one):
>>
>>    txt <- '2009-04-10 1400 Fri Hitchhikers Guide To The Galaxy42.txt'
>>    step1 <- unlist(strsplit(txt, '\\.'))
>>    step2 <- unlist(strsplit(step1[1], ' '))
>>    step2
>>    # [1] "2009-04-10" "1400" "Fri" "Hitchhikers" "Guide" "To" "The" "Galaxy42"
>>
>> What is now needed is to join all the separate elements of step2
>> into a single character string. paste() with its default arguments
>> won't do it, because it produces a separate character string for
>> each element of step2. cat() won't do it because it returns NULL
>> (so its output cannot be assigned).
>>
>> You could loop over step2:
>>
>>    Name1 <- step2[4]
>>    for(i in seq_along(step2)[-(1:4)]) Name1 <- paste(Name1,step2[i])
>>    Name1
>>    # [1] "Hitchhikers Guide To The Galaxy42"
>>
>> Then do your gsub:
>>
>>    Name <- gsub('[0-9]',replacement='', Name1)
>>    step3 <- paste(step2[1], step2[2], step2[3], sep=' ')
>>    paste('C:\\test\\', Name, '\\', step3, '\\', txt, sep='' )
>>
>> [1] "C:\\test\\Hitchhikers Guide To The Galaxy\\2009-04-10 1400
>> Fri\\2009-04-10 1400 Fri Hitchhikers Guide To The Galaxy42.txt"
>>
>> So that works; but it would be nice to be able to avoid the loop!
>>
>> Ted.
>>
>> --------------------------------------------------------------------
>> E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
>> Fax-to-email: +44 (0)870 094 0861
>> Date: 03-Jun-09                                       Time: 14:55:28
>> ------------------------------ XFMail ------------------------------
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>



