[Rd] '==' operator: inconsistency in data.frame(...) == NULL
Martin Maechler
maechler at stat.math.ethz.ch
Wed Sep 11 09:56:37 CEST 2019
>>>>> Hilmar Berger
>>>>> on Wed, 4 Sep 2019 15:25:46 +0200 writes:
> Dear all,
> I just stumbled upon some behavior of the == operator which is at least
> somewhat inconsistent.
> R version 3.6.1 (2019-07-05) -- "Action of the Toes"
> Copyright (C) 2019 The R Foundation for Statistical Computing
> Platform: x86_64-w64-mingw32/x64 (64-bit)
>> list(a=1:3, b=LETTERS[1:3]) == NULL
> logical(0)
>> matrix(1:6, 2,3) == NULL
> logical(0)
>> data.frame(a=1:3, b=LETTERS[1:3]) == NULL # same for == logical(0)
> Error in matrix(if (is.null(value)) logical() else value, nrow = nr,
> dimnames = list(rn, :
> length of 'dimnames' [2] not equal to array extent
>> data.frame(NULL) == 1
> <0 x 0 matrix>
>> data.frame(NULL) == NULL
> <0 x 0 matrix>
>> data.frame(NULL) == logical(0)
> <0 x 0 matrix>
> I wonder if data.frame(<some non-empty data>) == NULL should also return
> a value instead of an error. R help reads:
> "At least one of 'x' and 'y' must be an atomic vector, but
> if the other is a list R attempts to coerce it to the
> type of the atomic vector: this will succeed if the list
> is made up of elements of length one that can be coerced
> to the correct type.
> If the two arguments are atomic vectors of different
> types, one is coerced to the type of the other, the
> (decreasing) order of precedence being character, complex,
> numeric, integer, logical and raw."
> It is not clear from the help what to expect for NULL or
> empty atomic vectors.
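For reference, the coercion rules quoted from the help page can be tried out directly. The following is only a small sketch; the variable names are made up for illustration and are not part of any R API.

```r
# A list whose elements all have length one is coerced to the type
# of the atomic operand before the comparison is made.
list_vs_atomic <- list(1, 2, 3) == c(1, 2, 4)

# Two atomic operands of different types: the one lower in the
# precedence order is coerced to the type of the other.
int_vs_chr <- 1L == "1"     # the integer becomes the string "1"
lgl_vs_num <- TRUE == 1     # the logical becomes the number 1
num_vs_chr <- 1 == "1.0"    # the number becomes "1", which is not "1.0"

print(list_vs_atomic)
print(c(int_vs_chr, lgl_vs_num, num_vs_chr))
```

Note that the last comparison is FALSE because the comparison is done on character strings, not on numbers.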
Well, strictly speaking, an error would be expected for NULL,
as it is *not* an atomic vector. Your main issue,
data.frame(..) == NULL
is therefore already settled by the first half-sentence of the
documentation: neither operand is an atomic vector. By the same
reading, even data.frame(NULL) == NULL "should" return an error.
(Note: I'm not saying it really should, only that the reference
does not say it should work at all.)
Now, logical(0) on the other hand *is* an atomic vector ...
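The distinction between NULL and logical(0) can be inspected with a few base R calls. This is a minimal sketch with illustrative variable names. It deliberately avoids is.atomic(NULL), whose result has differed between R versions.

```r
# NULL has a type of its own, while logical(0) is a zero-length
# logical vector.
type_null <- typeof(NULL)
type_lgl0 <- typeof(logical(0))
lgl0_atomic <- is.atomic(logical(0))

# Compared with an atomic vector, both give a zero-length result.
cmp_null <- 1:3 == NULL
cmp_lgl0 <- 1:3 == logical(0)

print(c(type_null, type_lgl0))
print(lgl0_atomic)
print(length(cmp_null))
print(length(cmp_lgl0))
```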
> It is also strange that for list()
> there is no error but for data.frame() with the same data
> an error is thrown. I can see that there might be reasons
> to return logical(0) instead of FALSE, but I do not fully
> understand why there should be differences between
> e.g. matrix() and data.frame().
Well, a [regular base R] matrix() is atomic and a data frame is not.
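A quick way to see this difference, sketched with the same example data as in the original post:

```r
m  <- matrix(1:6, 2, 3)
df <- data.frame(a = 1:3, b = LETTERS[1:3])

# A base R matrix is an atomic vector with a "dim" attribute.
m_atomic <- is.atomic(m)
m_type   <- typeof(m)

# A data frame is a list of columns with a class attribute.
df_atomic <- is.atomic(df)
df_list   <- is.list(df)
df_type   <- typeof(df)

print(c(m_atomic, df_atomic, df_list))
print(c(m_type, df_type))
```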
> Also, it is at least somewhat strange that
> data.frame(NULL) == NULL and similar expressions return an
> empty matrix, while comparing a normal filled matrix to
> NULL returns logical(0).
> Even if this behavior is expected, the error message shown
> by data.frame(...) == NULL is not very informative.
I'm not at all sure there's any need for a change here.
I would say the following general thinking should be applied:
1. The general rule that '==' should be used only for comparing
atomic objects (as it returns an atomic object, a 'logical' with
corresponding attributes) is fundamental,
and using '==' for anything else has never been "the idea".
2. There are (two) "semi-exceptions" to the above:
2a) Sometimes it has been convenient to treat NULL as if it were
a zero-length atomic object (of "arbitrary" type/mode).
2b) data.frame()s "should typically" behave like matrices in
many situations, notably when indexed. (That rule is
violated on purpose by tibbles, e.g. "drop=FALSE" etc., but
that's another story.)
So because of these exceptions, you and possibly others may
think '==' should "work" with data.frame()s and/or NULL, but
I would not tend to agree.
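If the aim is to test whether an object is NULL, or whether two objects are the same, base R has functions that do not rely on '==' at all. The following is only a sketch of that alternative, not something prescribed by the documentation quoted above. The variable names are illustrative.

```r
df <- data.frame(a = 1:3, b = LETTERS[1:3])

# Testing for NULL: is.null() always returns a single TRUE or FALSE.
df_is_null   <- is.null(df)
null_is_null <- is.null(NULL)

# Testing whether two objects are the same: identical() also returns
# a single TRUE or FALSE, whatever the types of its arguments.
same_as_self <- identical(df, df)
same_as_null <- identical(df, NULL)

print(c(df_is_null, null_is_null, same_as_self, same_as_null))
```

Both functions return a length-one logical, so they are safe to use inside if().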
> Thanks and best regards,
> Hilmar
You are welcome!
Martin