[R] String manipulation

jim holtman jholtman at gmail.com
Sun Feb 13 20:07:15 CET 2011


If the string can contain an indeterminate number of letter and digit
groups, try the following:

> MyString <- "ABCFR34564IJVEOJC3434"
> # translate to the pattern sequences
> x <- chartr('ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'
+           , '000000000000000000000000001111111111'
+           , MyString
+           )
> x.rle <- rle(strsplit(x, '')[[1]])  # determine the runs
> # create extraction matrix
> x.ext <- cbind(cumsum(c(1, head(x.rle$lengths, -1)))
+                     , cumsum(x.rle$lengths)
+                     )
> substring(MyString, x.ext[,1], x.ext[,2])
[1] "ABCFR"   "34564"   "IJVEOJC" "3434"
>
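
The steps above can also be wrapped in a small function. This is only a
sketch: the name split_runs is hypothetical and not part of base R, and it
assumes that every non-digit character belongs to a "letter" group (so
lowercase letters and punctuation end up in the same runs as uppercase
letters). The last line shows a regular-expression alternative that relies
on regmatches(), which is only available in R 2.14.0 and later.

```r
# Hypothetical helper: split a string into alternating runs of
# digits and non-digits, using the same rle() idea as above.
split_runs <- function(s) {
  chars <- strsplit(s, "")[[1]]
  # TRUE where the character is a digit, FALSE otherwise (assumption:
  # everything that is not 0-9 counts as part of a letter group)
  is_digit <- chars %in% as.character(0:9)
  runs <- rle(is_digit)
  ends <- cumsum(runs$lengths)           # last position of each run
  starts <- c(1, head(ends, -1) + 1)     # first position of each run
  substring(s, starts, ends)
}

MyString <- "ABCFR34564IJVEOJC3434"
split_runs(MyString)

# Alternative: extract every maximal run of letters or of digits directly
regmatches(MyString, gregexpr("[A-Za-z]+|[0-9]+", MyString))[[1]]
```

Both calls should give the same four groups as the transcript above. The
regular-expression version silently drops any character that is neither a
letter nor a digit, whereas split_runs keeps it in a non-digit run.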


On Sun, Feb 13, 2011 at 10:27 AM, Megh Dal <megh700004 at gmail.com> wrote:
> Please consider the following string:
>
> MyString <- "ABCFR34564IJVEOJC3434"
>
> As you can see, there are 4 groups in the above string. The 1st and 3rd
> groups are English letters and the 2nd and 4th are numeric. Given a string,
> how can I separate out those 4 groups?
>
> Thanks for your time
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?


