[Rd] Pb with agrep()
Martin Maechler
maechler at stat.math.ethz.ch
Thu Jan 5 10:02:21 CET 2006
>>>>> "Herve" == Herve Pages <hpages at fhcrc.org>
>>>>> on Wed, 04 Jan 2006 17:29:35 -0800 writes:
Herve> Happy new year everybody,
Herve> I'm getting the following while trying to use the agrep() function:
>> pattern <- "XXX"
>> subject <- c("oooooo", "oooXooo", "oooXXooo", "oooXXXooo")
>> max <- list(ins=0, del=0, sub=0) # I want exact matches only
>> agrep(pattern, subject, max=max)
Herve> [1] 4
Herve> OK
>> max$sub <- 1 # One allowed substitution
>> agrep(pattern, subject, max=max)
Herve> [1] 3 4
Herve> OK
>> max$sub <- 2 # Two allowed substitutions
>> agrep(pattern, subject, max=max)
Herve> [1] 3 4
Herve> Wrong!
No.
You have overlooked the fact that the overall limit, 'all = 0.1' (10%),
*remains* the default even when 'max.distance' is specified as
a list, as in your example [from "?agrep"]:
>> max.distance: Maximum distance allowed for a match. Expressed either
>> as integer, or as a fraction of the pattern length (will be
>> replaced by the smallest integer not less than the
>> corresponding fraction), or a list with possible components
>>
>> 'all': maximal (overall) distance
>>
>> 'insertions': maximum number/fraction of insertions
>>
>> 'deletions': maximum number/fraction of deletions
>>
>> 'substitutions': maximum number/fraction of substitutions
>>
>>>>>> If 'all' is missing, it is set to 10%, the other components
>>>>>> default to 'all'. The component names can be abbreviated.
If you specify max$all as "100%", i.e., as 0.9999 (it must be '< 1'!), everything works
as you expect:
agrep(pattern, subject, max = list(ins=0, del=0, sub= 2, all = 0.9999))
## --> 2 3 4
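To see the effect of 'all' side by side, something along the following lines can be run. It is only a sketch that reuses the pattern and subject from the original post; the variable names res_default and res_explicit are made up here, and the output may depend on the R version, so none is shown.

```r
pattern <- "XXX"
subject <- c("oooooo", "oooXooo", "oooXXooo", "oooXXXooo")

res_default  <- list()  # 'all' left at its default
res_explicit <- list()  # 'all' set to 0.9999 as suggested above

for (k in 0:2) {
  key <- paste0("sub=", k)
  res_default[[key]] <- agrep(pattern, subject,
                              max.distance = list(ins = 0, del = 0, sub = k))
  res_explicit[[key]] <- agrep(pattern, subject,
                               max.distance = list(ins = 0, del = 0, sub = k,
                                                   all = 0.9999))
  cat(key, " default all:", res_default[[key]],
      " | all = 0.9999:", res_explicit[[key]], "\n")
}
```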
Martin Maechler, ETH Zurich