[Rd] splitting strings efficiently
Henrik Bengtsson
hb at stat.berkeley.edu
Thu Sep 25 00:39:20 CEST 2008
For strsplit(), note that fixed=TRUE is much faster. /HB
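
To make the suggestion concrete, here is a minimal sketch of splitting semicolon-delimited strings with strsplit(). The sample vectors `x` and `big` are made up for illustration and do not come from the original question. The timing lines are only a rough comparison, and the size of the difference depends on the machine and R version.

```r
# Hypothetical sample data: each element holds semicolon-separated values.
x <- c("a;b;c", "10;20", "solo", "p;q;")

# strsplit() is vectorised: it returns a list with one character vector per
# input string. fixed = TRUE treats ";" as a literal string rather than as a
# regular expression.
tokens <- strsplit(x, ";", fixed = TRUE)

tokens[[1]]  # "a" "b" "c"
tokens[[3]]  # "solo"
tokens[[4]]  # "p" "q"  (a trailing empty field is not returned)

# Rough timing comparison on a larger made-up input.
big <- rep("alpha;beta;gamma;delta", 1e5)
system.time(strsplit(big, ";"))                # regular-expression matching
system.time(strsplit(big, ";", fixed = TRUE))  # literal matching
```

One difference from a hand-written character loop is worth checking: strsplit() does not return an empty string for a separator at the very end of the input, so "p;q;" yields two tokens rather than three.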
On Wed, Sep 24, 2008 at 9:20 AM, Mark Kimpel <mwkimpel at gmail.com> wrote:
> I knew there HAD to be a basic function, but 'help.search("split string")'
> and 'help("string")' did not find it. Thanks for the help on this elementary
> question.
> Mark
> ------------------------------------------------------------
> Mark W. Kimpel MD ** Neuroinformatics ** Dept. of Psychiatry
> Indiana University School of Medicine
>
> 15032 Hunter Court, Westfield, IN 46074
>
> (317) 490-5129 Work, & Mobile & VoiceMail
> (317) 399-1219 Home
> Skype: mkimpel
>
> ******************************************************************
>
>
> On Wed, Sep 24, 2008 at 12:17 PM, Erik Iverson <iverson at biostat.wisc.edu> wrote:
>
>> ?strsplit
>>
>> Mark Kimpel wrote:
>>
>>> I have a very long list of strings. Each string actually contains multiple
>>> values separated by a semi-colon. I need to turn each string into a vector
>>> of the values delimited by the semi-colons. I know I can do this very
>>> laboriously by using loops, nchar, and substr, but it is terribly slow. Is
>>> there a basic R function that handles this situation? If not, is there
>>> perhaps a faster way to do it than I currently am, which is to lapply the
>>> following function? Thanks, Mark
>>>
>>>
>>> #######################################################################################
>>> string.tokenizer.func <- function(string, separator) {
>>>   new.vec <- NULL    # tokens collected so far
>>>   newString <- ""    # token currently being built
>>>   if (is.null(string)) {
>>>     new.vec <- ""
>>>   } else {
>>>     # Scan one character at a time; index nchar(string) + 1 yields ""
>>>     for (i in 1:(nchar(string) + 1)) {
>>>       if (substr(string, i, i) == separator) {
>>>         new.vec <- c(new.vec, newString)
>>>         newString <- ""
>>>       } else {
>>>         newString <- paste(newString, substr(string, i, i), sep = "")
>>>       }
>>>     }
>>>     new.vec <- c(new.vec, newString)    # append the final token
>>>   }
>>>   new.vec
>>> }
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>>
>>
>>
>