[R] Regular expression \ String Extraction help
(Ted Harding)
Ted.Harding at manchester.ac.uk
Wed Jun 3 15:55:31 CEST 2009
On 03-Jun-09 11:34:16, Tony Breyal wrote:
> Dear all,
> Is there a good way of doing the following conversion:
>
> [YYYY]-[MM]-[DD] [Time] [Day] [Name][Integer].[Extension]
>
> to become
>
> C:\test\[Name]\[YYYY]-[MM]-[DD] [Time] [Day]\[YYYY]-[MM]-[DD] [Time]
> [Day] [Name][Integer].[Extension]
>
> i.e. these
>
> 2009-04-10 1400 Fri Foo1.txt
> 2009-04-10 1400 Fri Universities2.txt
> 2009-04-10 1400 Fri Hitchhikers Guide To The Galaxy42.txt
>
> will become
>
> C:\test\Foo\2009-04-10 1400 Fri Foo1.txt
> C:\test\Universities\2009-04-10 1400 Fri Universities2.txt
> C:\test\Hitchhikers Guide To The Galaxy\2009-04-10 1400 Fri
> Hitchhikers Guide To The Galaxy42.txt
>
> My main issue is the conversion for 'Hitchhikers Guide To The
> Galaxy42' because of the spaces in the Name. So far this is what I
> have:
>
>> txt <- '2009-04-10 1400 Fri Universities1.txt'
>> step1 <- unlist(strsplit(txt, '\\.'))
>> step2 <- unlist(strsplit(step1[1], ' '))
>> Name <- gsub('[0-9]',replacement='', step2[4])
>> step3 <- paste(step2[1], step2[2], step2[3], sep=' ')
>> paste('C:\\test\\', Name, '\\', step3, '\\', txt, sep='' )
> [1] "C:\\test\\Universities\\2009-04-10 1400 Fri\\2009-04-10 1400 Fri
> Universities1.txt"
>
> Cheers,
> Tony Breyal
I can get as far as the following (using the "Hitchhikers" one):
txt <- '2009-04-10 1400 Fri Hitchhikers Guide To The Galaxy42.txt'
step1 <- unlist(strsplit(txt, '\\.'))
step2 <- unlist(strsplit(step1[1], ' '))
step2
# [1] "2009-04-10" "1400" "Fri" "Hitchhikers" "Guide" "To" "The" "Galaxy42"
What is now needed is to join the Name elements of step2 (the 4th
onwards) into a single character string. paste() with its default
arguments won't do it, because it returns a separate character
string for each element of step2. cat() won't do it because it
returns NULL (so what it prints cannot be assigned).
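To see both behaviours concretely, here is a small sketch. step2 is
typed in by hand so the snippet runs on its own, and the names
'pasted' and 'shown' are only for illustration:

```r
# step2 as produced by the two strsplit() calls above
step2 <- c("2009-04-10", "1400", "Fri",
           "Hitchhikers", "Guide", "To", "The", "Galaxy42")

# paste() on a single vector with default arguments keeps one
# string per element, so nothing is joined
pasted <- paste(step2[4:8])
length(pasted)

# cat() prints the words joined by spaces, but its return value
# is NULL, so there is nothing useful to assign
shown <- cat(step2[4:8], "\n")
is.null(shown)
```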
You could loop over step2:
Name1 <- step2[4]
# index from 5 on; empty when Name is one word, where 5:length(step2) would run backwards
for(i in seq_along(step2)[-(1:4)]) Name1 <- paste(Name1, step2[i])
Name1
# [1] "Hitchhikers Guide To The Galaxy42"
Then do your gsub:
Name <- gsub('[0-9]',replacement='', Name1)
step3 <- paste(step2[1], step2[2], step2[3], sep=' ')
paste('C:\\test\\', Name, '\\', step3, '\\', txt, sep='' )
# [1] "C:\\test\\Hitchhikers Guide To The Galaxy\\2009-04-10 1400
# Fri\\2009-04-10 1400 Fri Hitchhikers Guide To The Galaxy42.txt"
So that works; but it would be nice to be able to avoid the loop!
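One possible way to avoid the loop, offered as a sketch rather than
a definitive answer: paste() has a 'collapse' argument which joins
the elements of a single vector into one string. The helper name
build_path and its 'root' argument are made up for this example.
The sketch also assumes the extension is whatever follows the last
dot, and that only the trailing integer should be stripped from the
Name (the gsub above removes every digit):

```r
# Hypothetical helper: build the target path for one file name
build_path <- function(txt, root = "C:\\test\\") {
  base  <- sub("\\.[^.]*$", "", txt)               # drop the extension
  parts <- strsplit(base, " ", fixed = TRUE)[[1]]  # split on spaces
  stamp <- paste(parts[1:3], collapse = " ")       # "YYYY-MM-DD Time Day"
  name1 <- paste(parts[-(1:3)], collapse = " ")    # join the Name words
  name  <- sub("[0-9]+$", "", name1)               # strip trailing integer
  paste(root, name, "\\", stamp, "\\", txt, sep = "")
}

files <- c("2009-04-10 1400 Fri Foo1.txt",
           "2009-04-10 1400 Fri Universities2.txt",
           "2009-04-10 1400 Fri Hitchhikers Guide To The Galaxy42.txt")

# one path per file name, as a plain character vector
paths <- vapply(files, build_path, character(1), USE.NAMES = FALSE)
paths
```

Because parts[-(1:3)] simply takes whatever follows the first three
fields, the same code covers one-word and many-word Names alike.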
Ted.
--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 03-Jun-09 Time: 14:55:28
------------------------------ XFMail ------------------------------