[R] agrep behavior

Eduardo Leoni leoniedu at msu.edu
Wed Jun 24 21:12:58 CEST 2009


Thanks, David, for trying to replicate. Maybe it is a 2.9 problem? I
should have attached my session info:

> agrep("Staatssekretar im Bundeskanzleramt","Bundeskanzler",max.distance=.9)
integer(0)
> agrep("Staatssekretar im Bundeskanzleramt","Bundeskanzler",max.distance=.7)
[1] 1
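
In case it helps to line the fractions up with the integer thresholds,
here is a minimal sketch. It assumes (from my reading of ?agrep, not
checked against the C source) that a fractional max.distance is replaced
by the smallest integer not less than that fraction of the pattern
length. The variable names are mine, and the result may well differ
between R versions.

```r
# Assumption (from ?agrep, not checked against the C source): a fractional
# max.distance is replaced by ceiling(fraction * nchar(pattern)).
pattern   <- "Staatssekretar im Bundeskanzleramt"
target    <- "Bundeskanzler"
fractions <- c(0.6, 0.7, 0.89, 0.9, 0.99)

# Integer edit counts the fractions would stand for under that assumption.
edits <- ceiling(fractions * nchar(pattern))

# Re-run agrep with each integer distance and record whether it matches.
matched <- vapply(
  edits,
  function(k) length(agrep(pattern, target, max.distance = k)) > 0,
  logical(1)
)

print(data.frame(max.distance = fractions, edits = edits, matched = matched))
```

Under that assumption, .89 and .9 would map to the same integer for this
34-character pattern, so comparing the "matched" column across versions
should show whether the fraction handling or the matching itself differs.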

> sessionInfo()
R version 2.9.0 (2009-04-17)
i386-apple-darwin8.11.1

locale:
en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

The same problem occurs on Windows:

> sessionInfo()
R version 2.9.0 (2009-04-17)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base





On Wed, Jun 24, 2009 at 2:57 PM, David Winsemius<dwinsemius at comcast.net> wrote:
> Unable to reproduce:
>
>> agrep("Staatssekretar im Bundeskanzleramt","Bundeskanzler",max.distance=.6)
> [1] 1
>>
>> agrep("Staatssekretar im Bundeskanzleramt","Bundeskanzler",max.distance=.89)
> [1] 1
>> agrep("Staatssekretar im Bundeskanzleramt","Bundeskanzler",max.distance=.9)
> [1] 1
>> agrep("Staatssekretar im Bundeskanzleramt","Bundeskanzler",max.distance=.99)
> [1] 1
>
>
> Using integers, the threshold is between 20 and 21. Something to do with
> encodings?
>
>> sessionInfo()
> R version 2.8.1 Patched (2009-01-19 r47650)
> i386-apple-darwin9.6.0
>
> locale:
> en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> On Jun 24, 2009, at 2:06 PM, Eduardo Leoni wrote:
>
>> Dear list -
>>
>> I am a bit puzzled by the behavior of agrep:
>>
>> The following command finds a match:
>>
>> agrep("Staatssekretar im Bundeskanzleramt","Bundeskanzler",max.distance=.6)
>>
>> But if I _increase_ the maximum distance to .9 it fails:
>>
>> agrep("Staatssekretar im Bundeskanzleramt","Bundeskanzler",max.distance=.9)
>>
>> What am I missing? (If we use integers in max.distance, the threshold
>> is between 30 and 31.)
>>
>> -Eduardo
>
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>
>



-- 
Science is the art of the soluble. (Peter Medawar)



