[Rd] splitting strings efficiently

Gabor Grothendieck ggrothendieck at gmail.com
Wed Sep 24 18:47:02 CEST 2008


Alternatively, one can create a text connection and read from it using read.table, scan, etc.

s <- c("12;13;14", "15;16;17")

# one row per string, one column per field
read.table(textConnection(s), sep = ";")
# or, as a single flat numeric vector of all the fields
scan(textConnection(s), sep = ";")
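
For comparison with the ?strsplit suggestion quoted below, here is a minimal sketch of what each approach returns for the same input. The helper name split_fields is hypothetical and is used only for illustration, and the conversion to numeric assumes every field is a number, as in the example above.

```r
s <- c("12;13;14", "15;16;17")

# strsplit() is vectorised over its input: it returns a list with one
# character vector per input string, so no explicit loop is needed.
# fixed = TRUE treats the separator as a literal string, not a regex.
split_fields <- function(x, sep = ";") {  # hypothetical helper name
  strsplit(x, sep, fixed = TRUE)
}

pieces <- split_fields(s)
pieces[[1]]
# [1] "12" "13" "14"

# Convert each piece to numeric (assumes all fields are numbers).
nums <- lapply(pieces, as.numeric)

# The text-connection route gives a rectangular result instead, so it
# expects every string to have the same number of fields.
con <- textConnection(s)
df <- read.table(con, sep = ";")
close(con)  # close explicitly so the connection is not left open
dim(df)
# [1] 2 3
```

If the strings can have differing numbers of fields, the list returned by strsplit is probably the safer choice; read.table would likely need fill = TRUE in that case.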


On Wed, Sep 24, 2008 at 12:20 PM, Mark Kimpel <mwkimpel at gmail.com> wrote:
> I knew there HAD to be a basic function, but 'help.search("split string")'
> and 'help("string")' did not find it. Thanks for the help on this elementary
> question.
> Mark
> ------------------------------------------------------------
> Mark W. Kimpel MD ** Neuroinformatics ** Dept. of Psychiatry
> Indiana University School of Medicine
>
> 15032 Hunter Court, Westfield, IN 46074
>
> (317) 490-5129 Work, & Mobile & VoiceMail
> (317) 399-1219 Home
> Skype: mkimpel
>
> ******************************************************************
>
>
> On Wed, Sep 24, 2008 at 12:17 PM, Erik Iverson <iverson at biostat.wisc.edu> wrote:
>
>> ?strsplit
>>
>> Mark Kimpel wrote:
>>
>>> I have a very long list of strings. Each string actually contains multiple
>>> values separated by a semi-colon. I need to turn each string into a vector
>>> of the values delimited by the semi-colons. I know I can do this very
>>> laboriously by using loops, nchar, and substr, but it is terribly slow. Is
>>> there a basic R function that handles this situation? If not, is there
>>> perhaps a faster way to do it than I currently am, which is to lapply the
>>> following function? Thanks, Mark
>>>
>>>
>>> #######################################################################################
>>> string.tokenizer.func<-function(string, separator){
>>>  new.vec<- NULL
>>>  newString<- ""
>>>  if(is.null(string)) {new.vec<-""} else {
>>>    for(i in 1:(nchar(string) + 1)){
>>>      if(substr(string, i, i) == separator){
>>>        new.vec<-c(new.vec,newString)
>>>        newString <- ""
>>>      } else {
>>>        newString<-paste(newString, substr(string, i, i), sep="")
>>>      }
>>>    }
>>>    new.vec<-c(new.vec,newString)
>>>  }
>>>  new.vec
>>> }
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>>
>>
>>
>


