[R] string problems (grep and regexpr)

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Mar 24 13:07:18 CET 2004


On Wed, 24 Mar 2004, MMarques Power wrote:

> 
> Recently working with strings and data
> I have found a small problem.
> 
> Windows XP
> R 1.8.1
> 
> Reading data from a "txt file" with readLines.
> Finding a specific line with the "grep" command works, all OK,
> but here comes the problem...
> After finding the correct line(s) I need to find a substring
> inside each string.
> In this case it is tabs, which I think are represented by "\t" in the grep command.
> Trying to use grep on each string, it only returns 1 ...

That says it is present in character element one.  Do read the help page:

Value:

     For 'grep' a vector giving either the indices of the elements of
     'x' that yielded a match or, if 'value' is 'TRUE', the matched
     elements.
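
As a small illustration of that Value section, here is a sketch showing that grep reports which elements of a vector match, not where inside a string the match occurs. The vector below is hypothetical data modelled on the poster's lines, and it assumes the fields are separated by single tabs.

```r
# Hypothetical data (not the poster's file): tab-separated fields are assumed.
d5 <- c("load0004\tnode0014\t0.05\t0.014583333",
        "a line with no tabs",
        "load0006\tnode0019\t0.05\t0.014583333")

# Indices of the elements of d5 that contain a tab.
idx <- grep("\t", d5)
print(idx)    # 1 3

# With value = TRUE the matching elements themselves are returned.
vals <- grep("\t", d5, value = TRUE)
print(length(vals))    # 2

# On a length-one vector the only possible answers are 1 or integer(0),
# which is why grep("\t", d5[1]) gives 1.
one <- grep("\t", d5[1])
print(one)    # 1
```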


> Afterwards I tried the regexpr command. It returns the correct position of the
> substring that I am looking for, but it only reports the first one.
> Does regexpr only return the first one?

Yes.
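
If every tab position really is wanted, one possible sketch is to split the string into single characters and ask which of them are tabs. The string x is hypothetical and assumes single tabs between fields. The second approach assumes a version of R that provides gregexpr(); treat it as an alternative, not as something known to be available in the poster's installation.

```r
# Hypothetical line, assuming the fields are separated by single tabs.
x <- "load0004\tnode0014\t0.05\t0.014583333"

# Split into individual characters and find which ones are tabs.
chars <- strsplit(x, "")[[1]]
pos_split <- which(chars == "\t")
print(pos_split)    # 9 18 23

# Alternative, assuming gregexpr() exists in the R version in use:
# it returns the start positions of all matches, one list element per string.
pos_greg <- as.integer(gregexpr("\t", x)[[1]])
print(pos_greg)     # 9 18 23
```

The first position, 9, agrees with what regexpr reported for the first line in the example.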

> Partial example:
> 
> d5 = c("load0004   node0014        0.05    0.014583333",
>        "load0005   node0017        0.05    0.014583333",
>        "load0006   node0019        0.05    0.014583333")
> 
>      
> >grep("\t",d5[1])
> [1] 1
> >regexpr("\t",d5[1])
> [1] 9
> attr(,"match.length")
> [1] 1
> 
> Any idea how to make regexpr (or grep) return all of the substrings?
> Am I missing anything obvious?

Telling us what you actually want to do!  Would

sapply(strsplit(d5, "\t"), length)

be closer to what you have in mind?
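
If the real aim is to get at the fields rather than the tab positions, the same strsplit call can be taken one step further. The following is only a sketch under the assumption that every line has the same number of tab-separated fields; the data are hypothetical, modelled on the example above.

```r
# Hypothetical data modelled on the example; single tabs between fields assumed.
d5 <- c("load0004\tnode0014\t0.05\t0.014583333",
        "load0005\tnode0017\t0.05\t0.014583333",
        "load0006\tnode0019\t0.05\t0.014583333")

# Number of tab-separated fields in each line.
n_fields <- sapply(strsplit(d5, "\t"), length)
print(n_fields)    # 4 4 4

# Bind the pieces into a character matrix, one row per line.
# This only works cleanly when all lines have the same number of fields.
fields <- do.call("rbind", strsplit(d5, "\t"))
print(dim(fields))    # 3 4

# Numeric columns still need converting from character.
third <- as.numeric(fields[, 3])
print(third)    # 0.05 0.05 0.05
```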

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



