[R] Regular expression \ String Extraction help

Greg Snow Greg.Snow at imail.org
Wed Jun 3 17:20:15 CEST 2009


Ted,

Try using paste() with the collapse argument: it joins the elements of a
vector into a single string, so no loop is needed.
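
As a minimal sketch, assuming the same splitting steps as in the quoted code below, collapse can replace the loop. The negative index step2[-(1:3)] and the [1:3] shorthand are illustrative choices here, not something from the original posts:

```r
# File name taken from the quoted example
txt <- '2009-04-10 1400 Fri Hitchhikers Guide To The Galaxy42.txt'

# Drop the extension, then split the remainder on single spaces
step1 <- unlist(strsplit(txt, '\\.'))
step2 <- unlist(strsplit(step1[1], ' '))

# collapse joins all elements of one vector into a single string;
# everything after the first three fields belongs to the name
Name1 <- paste(step2[-(1:3)], collapse = ' ')
Name  <- gsub('[0-9]', replacement = '', Name1)

# The first three fields (date, time, day) form the sub-directory name
step3 <- paste(step2[1:3], collapse = ' ')

path <- paste('C:\\test\\', Name, '\\', step3, '\\', txt, sep = '')
cat(path, '\n')
```

Because the name is taken as "everything after the third field", the same lines should also handle single-word names such as 'Universities1'.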

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111


> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Ted Harding
> Sent: Wednesday, June 03, 2009 7:56 AM
> To: Tony Breyal; r-help at r-project.org
> Subject: Re: [R] Regular expression \ String Extraction help
> 
> On 03-Jun-09 11:34:16, Tony Breyal wrote:
> > Dear all,
> > Is there a good way of doing the following conversion:
> >
> > [YYYY]-[MM]-[DD] [Time] [Day] [Name][Integer].[Extension]
> >
> > to become
> >
> > C:\test\[Name]\[YYYY]-[MM]-[DD] [Time] [Day]\[YYYY]-[MM]-[DD] [Time]
> > [Day] [Name][Integer].[Extension]
> >
> > i.e. these
> >
> > 2009-04-10 1400 Fri Foo1.txt
> > 2009-04-10 1400 Fri Universities2.txt
> > 2009-04-10 1400 Fri Hitchhikers Guide To The Galaxy42.txt
> >
> > will become
> >
> > C:\test\Foo\2009-04-10 1400 Fri\2009-04-10 1400 Fri Foo1.txt
> > C:\test\Universities\2009-04-10 1400 Fri\2009-04-10 1400 Fri
> > Universities2.txt
> > C:\test\Hitchhikers Guide To The Galaxy\2009-04-10 1400 Fri\2009-04-10
> > 1400 Fri Hitchhikers Guide To The Galaxy42.txt
> >
> > My main issue is the conversion for 'Hitchhikers Guide To The
> > Galaxy42' because of the spaces in the Name. So far this is what I
> > have:
> >
> >> txt <- '2009-04-10 1400 Fri Universities1.txt'
> >> step1 <- unlist(strsplit(txt, '\\.'))
> >> step2 <- unlist(strsplit(step1[1], ' '))
> >> Name <- gsub('[0-9]',replacement='', step2[4])
> >> step3 <- paste(step2[1], step2[2], step2[3], sep=' ')
> >> paste('C:\\test\\', Name, '\\', step3, '\\', txt, sep='' )
> > [1] "C:\\test\\Universities\\2009-04-10 1400 Fri\\2009-04-10 1400 Fri
> > Universities1.txt"
> >
> > Cheers,
> > Tony Breyal
> 
> I can get as far as the following (using the "Hitchhikers" one):
> 
>    txt <- '2009-04-10 1400 Fri Hitchhikers Guide To The Galaxy42.txt'
>    step1 <- unlist(strsplit(txt, '\\.'))
>    step2 <- unlist(strsplit(step1[1], ' '))
>    step2
>    # [1] "2009-04-10" "1400" "Fri" "Hitchhikers" "Guide"
>    # [6] "To" "The" "Galaxy42"
> 
> What is now needed is to join all the separate elements of step2
> into a single character string. paste() won't do it, because it
> produces a separate character string for each element of step2.
> cat() won't do it because it has no value (so cannot be assigned).
> 
> You could loop over step2:
> 
>    Name1 <- step2[4]
>    for(i in seq_along(step2)[-(1:4)]) Name1 <- paste(Name1,step2[i])
>    Name1
>    # [1] "Hitchhikers Guide To The Galaxy42"
> 
> Then do your gsub:
> 
>    Name <- gsub('[0-9]',replacement='', Name1)
>    step3 <- paste(step2[1], step2[2], step2[3], sep=' ')
>    paste('C:\\test\\', Name, '\\', step3, '\\', txt, sep='' )
> 
> [1] "C:\\test\\Hitchhikers Guide To The Galaxy\\2009-04-10 1400
> Fri\\2009-04-10 1400 Fri Hitchhikers Guide To The Galaxy42.txt"
> 
> So that works; but it would be nice to be able to avoid the loop!
> 
> Ted.
> 
> --------------------------------------------------------------------
> E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
> Fax-to-email: +44 (0)870 094 0861
> Date: 03-Jun-09                                       Time: 14:55:28
> ------------------------------ XFMail ------------------------------



