Comparison {base}                                        R Documentation
Relational Operators
Description
Binary operators which allow the comparison of values in atomic vectors.
Usage
x < y
x > y
x <= y
x >= y
x == y
x != y
Arguments
x, y
atomic vectors, symbols, calls, or other objects for which methods have been written.
Details
The binary comparison operators are generic functions: methods can be
written for them individually or via the Ops group generic function.
(See Ops for how dispatch is computed.)
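As a minimal sketch of dispatch through the Ops group generic, the following defines a single S3 method that serves all six comparison operators. The class name version_tag and its rank field are hypothetical, introduced only for illustration.

```r
# Hypothetical S3 class "version_tag" wrapping an integer rank.
version_tag <- function(rank) structure(list(rank = rank), class = "version_tag")

# One Ops method covers all six comparison operators; .Generic holds
# the name of the operator that was actually called.
Ops.version_tag <- function(e1, e2) {
  if (!.Generic %in% c("<", ">", "<=", ">=", "==", "!="))
    stop("operator ", .Generic, " is not supported for version_tag")
  get(.Generic)(e1$rank, e2$rank)
}

a <- version_tag(1L)
b <- version_tag(3L)
a < b    # TRUE
a == b   # FALSE
```

The sketch assumes both operands are version_tag objects; a fuller method would also handle mixed operands.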
Comparison of strings in character vectors is lexicographic within the
strings using the collating sequence of the locale in use: see locales.
The collating sequence of locales such as ‘en_US’ is normally different
from ‘C’ (which should use ASCII) and can be surprising. Beware of
making any assumptions about the collation order: e.g. in Estonian Z
comes between S and T, and collation is not necessarily
character-by-character; in Danish aa sorts as a single letter, after z.
In Welsh ng may or may not be a single sorting unit: if it is, it
follows g. Some platforms may not respect the locale and always sort in
numerical order of the bytes in an 8-bit locale, or in Unicode
code-point order for a UTF-8 locale (and may not sort in the same order
for the same language in different character sets). Collation of
non-letters (spaces, punctuation signs, hyphens, fractions and so on) is
even more problematic.
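The locale dependence described above can be explored with a short sketch. It relies on the documented behaviour that sort() with method = "radix" uses C-locale collation for character vectors; the result of the plain sort() call depends on the session locale and is therefore not shown.

```r
words <- c("b", "A", "a", "B")

# Locale-dependent: the order may differ between "C", "en_US", etc.
sort(words)

# Locale-independent: method = "radix" uses C-locale collation for
# character vectors, so upper-case ASCII letters sort first.
radix_sorted <- sort(words, method = "radix")
radix_sorted   # "A" "B" "a" "b"

# Inspect the collation locale currently in effect.
Sys.getlocale("LC_COLLATE")
```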
Character strings can be compared with different marked encodings
(see Encoding): they are translated to UTF-8 before comparison.
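A small sketch of comparing strings with different marked encodings, assuming a build of R that can mark strings as latin1 and UTF-8 (the usual case). The sample string is the one used in the Encoding help page.

```r
x <- "fa\xE7ile"            # Latin-1 bytes; \xE7 is c-cedilla
Encoding(x) <- "latin1"     # mark the encoding explicitly
y <- enc2utf8(x)            # same text, re-encoded and marked as UTF-8

Encoding(c(x, y))           # "latin1" "UTF-8"
same_text  <- x == y        # compared after translation to UTF-8
same_bytes <- identical(charToRaw(x), charToRaw(y))
```

Here same_text is TRUE even though the underlying bytes differ (same_bytes is FALSE).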
Raw vectors should not really be considered to have an order, but the numeric order of the byte representation is used.
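For illustration, comparisons on raw vectors follow the numeric value of each byte:

```r
r <- as.raw(c(0x10, 0xff, 0x01))
below <- r < as.raw(0x20)   # TRUE FALSE TRUE
r[r >= as.raw(0x10)]        # 10 ff
```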
At least one of x and y must be an atomic vector, but if the other is a
list R attempts to coerce it to the type of the atomic vector: this will
succeed if the list is made up of elements of length one that can be
coerced to the correct type.
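A brief sketch of the list case, using lists whose elements all have length one:

```r
lst <- list(1, 2L, 3)              # every element has length one
num_cmp <- lst == c(1, 2, 4)       # TRUE TRUE FALSE

chr_cmp <- list("a", "b") == c("a", "c")   # TRUE FALSE
```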
If the two arguments are atomic vectors of different types, one is coerced to the type of the other, the (decreasing) order of precedence being character, complex, numeric, integer, logical and raw.
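The precedence order can be seen in a few one-line comparisons. The last line is left without a stated result because it is a string comparison and so depends on collation.

```r
int_dbl <- 1L == 1.0         # integer vs double, compared as numbers: TRUE
lgl_num <- TRUE == 1         # logical coerced to numeric: TRUE
num_chr <- 1 == "1"          # numeric coerced to character "1": TRUE
lgl_chr <- TRUE == "TRUE"    # logical coerced to character "TRUE": TRUE

2 < "10"                     # compared as the strings "2" and "10"
```

Converting explicitly, for example with as.numeric("10"), avoids an unintended string comparison.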
Missing values (NA) and NaN values are regarded as non-comparable even
to themselves, so comparisons involving them will always result in NA.
Missing values can also result when character strings are compared and
one is not valid in the current collation locale.
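The following sketch shows how NA and NaN propagate through comparisons and how that affects subsetting:

```r
v <- c(1, NA, 3, NaN)
gt <- v > 2            # FALSE NA TRUE NA
na_self <- NA == NA    # NA, not TRUE
NaN == NaN             # NA

# Test for missingness with is.na(), not with ==.
is.na(v)               # FALSE TRUE FALSE TRUE

# which() drops NA results; plain logical indexing keeps them as NA.
idx <- which(v > 2)    # 3
v[v > 2]               # NA 3 NA
```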
Language objects such as symbols and calls can only be used as operands
for == and !=; the other comparisons signal an error when one of the
operands is a language object. Currently language objects are deparsed
to character strings before comparison. This can be inefficient and may
not be what is really wanted. For equality comparisons identical is
usually a better choice.
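A short sketch contrasting == with identical on calls. The tryCatch wrapper is there only so that the ordering comparison, which the text above says signals an error, does not stop the script.

```r
e1 <- quote(f(x, y))
e2 <- quote(f(x, y))

eq_op <- e1 == e2              # TRUE, via deparsed strings
eq_id <- identical(e1, e2)     # TRUE, structural comparison

# Ordering operators are not defined for language objects.
lt_res <- tryCatch(quote(a) < quote(b), error = function(e) "error")
```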
Value
A logical vector indicating the result of the element by element comparison. The elements of shorter vectors are recycled as necessary.
Objects such as arrays or time-series can be compared this way provided they are conformable.
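Recycling and the preservation of array shape can be illustrated as follows:

```r
rec <- 1:6 > c(2, 4)      # c(2, 4) is recycled to 2 4 2 4 2 4
rec                       # FALSE FALSE TRUE FALSE TRUE TRUE

m <- matrix(1:6, nrow = 2)
big <- m > 3              # the result keeps the 2 x 3 shape
dim(big)                  # 2 3
```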
S4 methods
These operators are members of the S4 Compare group generic, and so
methods can be written for them individually as well as for the group
generic (or the Ops group generic), with arguments c(e1, e2).
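A minimal sketch of a method for the Compare group generic. The class Money and its amount slot are hypothetical; callGeneric re-invokes whichever operator was actually called on the underlying numeric values.

```r
library(methods)

# Hypothetical S4 class used only for illustration.
setClass("Money", representation(amount = "numeric"))

setMethod("Compare", signature(e1 = "Money", e2 = "Money"),
          function(e1, e2) callGeneric(e1@amount, e2@amount))

five <- new("Money", amount = 5)
ten  <- new("Money", amount = 10)
five < ten     # TRUE
five == ten    # FALSE
```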
Note
Do not use == and != for tests, such as in if expressions, where you
must get a single TRUE or FALSE. Unless you are absolutely sure that
nothing unusual can happen, you should use the identical function
instead.
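The difference matters whenever an operand is longer than one element, is missing, or has an unexpected type, as this sketch shows:

```r
a <- c(1, 2, 3)
b <- c(1, 2, 3)

a == b                      # length-3 logical vector, unsuitable for if()
same <- identical(a, b)     # always a single TRUE or FALSE

na_eq <- NA == NA           # NA: if (NA == NA) would signal an error
na_id <- identical(NA, NA)  # TRUE

strict <- identical(1L, 1)  # FALSE: identical() also compares types

if (same) message("a and b are identical")
```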
For numerical and complex values, remember == and != do not allow for
the finite representation of fractions, nor for rounding error. Using
all.equal with identical or isTRUE is almost always preferable; see the
examples. (This also applies to the other comparison operators.)
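The same caveat for an ordering operator, as a sketch assuming IEEE 754 double arithmetic. The helper leq_tol is hypothetical and simply combines < with all.equal at its default tolerance.

```r
s <- 0.1 * 3               # not exactly 0.3 in binary floating point
s == 0.3                   # FALSE on IEEE 754 platforms
s <= 0.3                   # also FALSE there, although equal on paper

near <- isTRUE(all.equal(s, 0.3))   # TRUE: equal within tolerance

# Hypothetical helper: "less than or nearly equal" for scalars.
leq_tol <- function(a, b) a < b || isTRUE(all.equal(a, b))
leq_tol(s, 0.3)            # TRUE
```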
These operators are sometimes called as functions as e.g. `<`(x, y):
see the description of how argument-matching is done in Ops.
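The functional form is mostly useful when an operator has to be passed to another function, as in this sketch:

```r
`<`(1, 2)                          # same as 1 < 2

# Passing an operator as a function argument.
tab <- outer(1:3, 1:3, `<=`)       # 3 x 3 logical matrix
pairwise <- mapply(`>`, c(5, 1, 7), c(3, 4, 7))   # TRUE FALSE FALSE
both <- do.call(`==`, list("a", "a"))             # TRUE
```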
References
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Collation of character strings is a complex topic. For an introduction see https://en.wikipedia.org/wiki/Collating_sequence. The Unicode Collation Algorithm (https://unicode.org/reports/tr10/) is likely to be increasingly influential. Where available R by default makes use of ICU (https://icu.unicode.org/) for collation (except in a C locale).
See Also
Logic on how to combine results of comparisons, i.e., logical vectors.
factor for the behaviour with factor arguments.
Syntax for operator precedence.
capabilities for whether ICU is available, and icuSetCollate to tune the string collation algorithm when it is.
Examples
x <- stats::rnorm(20)
x < 1
x[x > 0]
x1 <- 0.5 - 0.3
x2 <- 0.3 - 0.1
x1 == x2 # FALSE on most machines
isTRUE(all.equal(x1, x2)) # TRUE everywhere
# range of most 8-bit charsets, as well as of Latin-1 in Unicode
z <- c(32:126, 160:255)
x <- if (l10n_info()$MBCS) {
    intToUtf8(z, multiple = TRUE)
} else rawToChar(as.raw(z), multiple = TRUE)
## by number
writeLines(strwrap(paste(x, collapse=" "), width = 60))
## by locale collation
writeLines(strwrap(paste(sort(x), collapse=" "), width = 60))