[Rd] Regression in match() in R 3.3.0 when matching strings with different character encodings
Kirill Müller
kirill.mueller at ivt.baug.ethz.ch
Mon May 9 16:07:21 CEST 2016
Hi
I think the following behavior is a regression from R 3.2.5:
> match(iconv( c("\u00f8", "A"), from = "UTF8", to = "latin1" ), "\u00f8")
[1] 1 NA
> match(iconv( c("\u00f8"), from = "UTF8", to = "latin1" ), "\u00f8")
[1] NA
> match(iconv( c("\u00f8"), from = "UTF8", to = "latin1" ), "\u00f8", incomparables = NA)
[1] 1
I'm seeing this in R 3.3.0 on both Windows and Ubuntu 15.10.
The specific behavior makes me think this is related to the following
NEWS entry:
    match(x, table) is faster (sometimes by an order of magnitude) when x is
    of length one and incomparables is unchanged (PR#16491).
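Until this is resolved, a possible workaround is sketched below. It assumes the mismatch comes from the two strings carrying different encoding marks, and it re-encodes both arguments to UTF-8 with enc2utf8() before calling match(). This is an assumption for illustration, not a confirmed fix, and the variable names are hypothetical.

```r
# The same character in two encodings, as in the report above.
x_utf8   <- "\u00f8"
x_latin1 <- iconv(x_utf8, from = "UTF-8", to = "latin1")

# The strings display identically but carry different encoding marks.
print(Encoding(x_utf8))
print(Encoding(x_latin1))

# Assumed workaround: normalise both sides to UTF-8 before matching,
# so that match() compares strings with the same encoding mark.
res <- match(enc2utf8(x_latin1), enc2utf8(x_utf8))
print(res)
```

Normalising both sides leaves ASCII strings untouched, so it should be safe to apply to mixed input.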
Best regards
Kirill