[R] count occurrence and distance of characters in string

William Dunlap wdunlap at tibco.com
Fri Nov 5 03:44:17 CET 2010


> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of Immanuel
> Sent: Thursday, November 04, 2010 4:54 PM
> To: Charles C. Berry
> Cc: r-help at r-project.org
> Subject: Re: [R] count occurrence and distance of characters in string
> 
> Hey,
> 
> thanks for the answer, actually I already typed an example
> but deleted it since I thought it's superfluous.
> regards
> 
> ---------
> string <- "kjokllokkoadddo"
> 
> # f1(string, "o") should return that "o" was found 4 times
> # f2(string, "o") should return that the distances between the "o"'s
> # found is 3, 2, 4
> ---------

Try gregexpr():
  > string <- "kjokllokkoadddo"
  > gregexpr("o", string)
  [[1]]
  [1]  3  7 10 15
  attr(,"match.length")
  [1] 1 1 1 1

  > gregexpr("o", c("kjokllokkoadddo", "ooofoo", "abcde"))
  [[1]]
  [1]  3  7 10 15
  attr(,"match.length")
  [1] 1 1 1 1

  [[2]]
  [1] 1 2 3 5 6
  attr(,"match.length")
  [1] 1 1 1 1 1

  [[3]] 
  [1] -1
  attr(,"match.length")
  [1] -1
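
One caveat with the calls above: gregexpr() treats its first argument as a regular expression, so a search character such as "." or "+" would not be taken literally. A minimal sketch, assuming the goal is always to find a literal character; fixed = TRUE is a standard gregexpr() argument, while the example string is made up for illustration:

```r
s <- "a.b.c"                       # hypothetical example string

# As a regular expression, "." matches every character.
regex_pos <- as.integer(gregexpr(".", s)[[1]])

# With fixed = TRUE the pattern is matched literally.
literal_pos <- as.integer(gregexpr(".", s, fixed = TRUE)[[1]])

print(regex_pos)    # positions 1 through 5
print(literal_pos)  # positions 2 and 4
```

For a plain letter like "o" both forms give the same positions, so this only matters when the character has a special meaning in regular expressions.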

Postprocess its output with length and diff to get what
you want.  E.g.,

  > g <- gregexpr("o", c("kjokllokkoadddo", "ooofoo", "abcde"))
  > sapply(g,length)
  [1] 4 5 1
  > lapply(g,function(x)diff(x)-1)
  [[1]]
  [1] 3 2 4

  [[2]]
  [1] 0 0 1 0

  [[3]]
  numeric(0)
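
Note that the string with no match is counted as 1 above, because gregexpr() returns -1 in that case. Here is a sketch of the f1/f2 pair from the question that guards against this. The names char_positions, count_char and char_gaps are hypothetical, and fixed = TRUE assumes a literal search character:

```r
# Positions of `ch` in each element of `x`; integer(0) when there is no match.
char_positions <- function(x, ch) {
  lapply(gregexpr(ch, x, fixed = TRUE), function(m) {
    m <- as.integer(m)
    m[m > 0]          # gregexpr() signals "no match" with -1
  })
}

# f1: number of occurrences per string.
count_char <- function(x, ch) {
  vapply(char_positions(x, ch), length, integer(1))
}

# f2: number of other characters between consecutive occurrences.
char_gaps <- function(x, ch) {
  lapply(char_positions(x, ch), function(p) diff(p) - 1L)
}

strings <- c("kjokllokkoadddo", "ooofoo", "abcde")
print(count_char(strings, "o"))   # 4 5 0
print(char_gaps(strings, "o"))    # 3 2 4 / 0 0 1 0 / integer(0)
```

Dropping the -1 before counting is the only change from the one-liners above; the diff(x) - 1 step is the same.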

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

> 
> 
> On 11/05/2010 12:28 AM, Charles C. Berry wrote:
> > On Thu, 4 Nov 2010, Immanuel wrote:
> >
> >> Hello all,
> >>
> >> I want to know how often one character occurs in a given string
> >> and the distance between every two occurrences (distance = other
> >> characters between them).
> >
> > You should provide "commented, minimal, self-contained, reproducible
> > code" as asked.
> >
> > And especially for a question like this one with many simple answers
> > that RespondeRs will shower you with if only you give them a starting
> > point.
> >
> > Use tapply, strsplit, seq, nchar, unlist, diff, "-", and table for one
> > way.
> >
> > Chuck
> >
> >>
> >> thanks
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> > Charles C. Berry                            Dept of Family/Preventive Medicine
> > cberry at tajo.ucsd.edu                UC San Diego
> > http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
> >
> >
> >
> 


