aregexec {utils} R Documentation

## Approximate String Match Positions

### Description

Determine positions of approximate string matches.

### Usage

    aregexec(pattern, text, max.distance = 0.1, costs = NULL,
             ignore.case = FALSE, fixed = FALSE, useBytes = FALSE)


### Arguments

| Argument | Description |
|---|---|
| pattern | a non-empty character string or a character string containing a regular expression (for fixed = FALSE) to be matched. Coerced by as.character to a string if possible. |
| text | character vector where matches are sought. Coerced by as.character to a character vector if possible. |
| max.distance | maximum distance allowed for a match. See agrep. |
| costs | cost of transformations. See agrep. |
| ignore.case | a logical. If TRUE, case is ignored for computing the distances. |
| fixed | a logical. If TRUE, the pattern is matched literally (as is). Otherwise (default), it is matched as a regular expression. |
| useBytes | a logical. If TRUE, comparisons are byte-by-byte rather than character-by-character. |
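As a rough illustration of how max.distance and ignore.case affect the result, the sketch below reuses the strings from the Examples section. The helper hit() and the variable names strict, loose and nocase are hypothetical and not part of utils.

```r
x <- c("1 lazy", "1", "1 LAZY")

# Hypothetical helper: TRUE where an element of the result records a match.
hit <- function(m) vapply(m, function(p) p[1L] != -1L, logical(1))

# Allow at most one edit between the pattern and the matched substring.
strict <- aregexec("laysy", x, max.distance = 1)

# Allow up to two edits.
loose <- aregexec("laysy", x, max.distance = 2)

# Up to two edits, ignoring case when computing the distances.
nocase <- aregexec("laysy", x, max.distance = 2, ignore.case = TRUE)

# One row per setting, one column per element of x.
print(rbind(strict = hit(strict), loose = hit(loose), nocase = hit(nocase)))
```

With these inputs, one would expect no match at a distance of 1, a match only in the first string at a distance of 2, and an additional match in the third string once case is ignored.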

### Details

aregexec provides a different interface to approximate string matching than agrep (along the lines of the interfaces to exact string matching provided by regexec and grep).

Note that by default, agrep performs literal matches, whereas aregexec performs regular expression matches.
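The difference in defaults can be made visible with a pattern that contains a regular expression metacharacter. The following is a minimal sketch, assuming that max.distance = 0 reduces approximate matching to exact matching; the strings and variable names are illustrative only.

```r
x <- c("abc", "a.c")

# agrep() defaults to fixed = TRUE, so "." is an ordinary character here.
lit_idx <- agrep("a.c", x, max.distance = 0)
print(lit_idx)

# aregexec() defaults to fixed = FALSE, so "." stands for any character.
re <- aregexec("a.c", x, max.distance = 0)
print(re)

# Passing fixed = TRUE makes aregexec() treat the pattern literally as well.
lit <- aregexec("a.c", x, max.distance = 0, fixed = TRUE)
print(lit)
```

Setting fixed explicitly in both calls avoids relying on the differing defaults when code switches between the two functions.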

See agrep and adist for more information about approximate string matching and distances.

Comparisons are byte-by-byte if pattern or any element of text is marked as "bytes".

### Value

A list of the same length as text. Each element is either -1 if there is no match, or an integer vector giving the starting position of the overall match followed by the starting positions of the substrings corresponding to the parenthesized subexpressions of pattern. Each element has an attribute "match.length", an integer vector giving the lengths of the matches (or -1 for no match).
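The structure of the return value can be inspected directly. The sketch below is illustrative only: the variable names starts, lens and pieces are hypothetical, and rebuilding the substrings by hand with substring() is shown merely to make the meaning of the positions and lengths concrete; regmatches is the usual tool for this.

```r
x <- c("1 lazy", "1", "1 LAZY")
m <- aregexec("(lay)(sy)", x, max.distance = 2)

# One list element per element of text.
stopifnot(is.list(m), length(m) == length(x))

# A matching element: the overall match first, then one entry per
# parenthesized subexpression of the pattern.
starts <- m[[1]]
lens <- attr(m[[1]], "match.length")

# Rebuild the matched substrings from the start positions and lengths.
pieces <- substring(x[1], starts, starts + lens - 1L)
print(pieces)

# A non-matching element is -1, with "match.length" also -1.
print(m[[2]])
```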

### See Also

regmatches for extracting the matched substrings.

### Examples

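    ## Cf. the examples for agrep.
    x <- c("1 lazy", "1", "1 LAZY")
    ## No subexpressions: only the position of the overall match is reported.
    aregexec("laysy", x, max.distance = 2)
    ## With subexpressions: the overall match, then each parenthesized group.
    aregexec("(lay)(sy)", x, max.distance = 2)
    ## Ignore case when computing the distances.
    aregexec("(lay)(sy)", x, max.distance = 2, ignore.case = TRUE)
    ## Extract the matched substrings.
    m <- aregexec("(lay)(sy)", x, max.distance = 2)
    regmatches(x, m)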
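As a possible follow-up to the regmatches call above, the extracted pieces can be collected into a character matrix. This is only a sketch: the names parts, matched and tab and the column labels are hypothetical, and it assumes at least one element of x matches.

```r
x <- c("1 lazy", "1", "1 LAZY")
m <- aregexec("(lay)(sy)", x, max.distance = 2)
parts <- regmatches(x, m)

# Elements without a match come back as character(0).
matched <- lengths(parts) > 0

# One row per matching string, one column per extracted piece.
tab <- do.call(rbind, parts[matched])
colnames(tab) <- c("match", "group1", "group2")
rownames(tab) <- x[matched]
print(tab)
```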


[Package utils version 4.3.0 Index]