[Rd] class(<matrix>) |--> c("matrix", "array") -- and S3 dispatch
Martin Maechler
m@ech|er @end|ng |rom @t@t@m@th@ethz@ch
Thu Nov 21 17:57:51 CET 2019
TLDR: This is quite technical, still somewhat important:
1) R 4.0.0 will become a bit more coherent: a matrix is an array
2) Your package (or one you use) may be affected.
>>>>> Martin Maechler
>>>>> on Fri, 15 Nov 2019 17:31:15 +0100 writes:
>>>>> Pages, Herve
>>>>> on Thu, 14 Nov 2019 19:13:47 +0000 writes:
>> On 11/14/19 05:47, Hadley Wickham wrote:
>>> On Sun, Nov 10, 2019 at 2:37 AM Martin Maechler ... wrote:
[................]
>>>>> Note again that both "matrix" and "array" are special [see ?class] as
>>>>> being of __implicit class__ and I am considering that this
>>>>> implicit class behavior for these two should be slightly
>>>>> changed ....
>>>>>
>>>>> And indeed I think you are right on spot and this would mean
>>>>> that indeed the implicit class
>>>>> "matrix" should rather become c("matrix", "array").
>>>>
>>>> I've made up my mind (and not been contradicted by my fellow R
>>>> corers) to try to go there for R 4.0.0 next April.
>>> I can't seem to find the previous thread, so would you mind being a
>>> bit more explicit here? Do you mean adding "array" to the implicit
>>> class?
>> It's late in Europe ;-)
>> That's my understanding. I think the plan is to have class(matrix())
>> return c("matrix", "array"). No class attributes added to matrix or
>> array objects.
>> That's all that is needed to have inherits(matrix(), "array") return TRUE
>> (instead of FALSE at the moment) and S3 dispatch pick up the foo.array
>> method when foo(matrix()) is called and there is no foo.matrix method.
> Thank you, Hervé! That's exactly the plan.
BUT what I (and Peter and Hervé and ....) had assumed is wrong:
If I just change the class
(as I already did a few days ago, but you must activate the change
via environment variable, see below),
S3 dispatch does *NOT* at all pick it up:
"matrix" (and "array") are even more special here (see below),
and from Hadley's questions, in hindsight I now see that he's been aware
of that and I hereby apologize to Hadley for not having thought
and looked more, when he asked ..
Half an hour ago, I've done another source code commit (svn r77446),
to "R-devel" only, of course, and the R-devel NEWS now starts as
------------------------------------------------------------
CHANGES IN R-devel:
USER-VISIBLE CHANGES:
• .... intention that the next non-patch release should be 4.0.0.
• R now builds by default against a PCRE2 library ........
...................
...................
• For now only active when environment variable
_R_CLASS_MATRIX_ARRAY_ is set to non-empty, but planned to be the
new unconditional behavior when R 4.0.0 is released:
Newly, matrix objects also inherit from class "array", namely,
e.g., class(diag(1)) is c("matrix", "array") which invalidates
code (wrongly) assuming that length(class(obj)) == 1, a wrong
assumption that is less frequently fulfilled now. (Currently
only after setting _R_CLASS_MATRIX_ARRAY_ to non-empty.)
S3 methods for "array", i.e., <someFun>.array(), are now also
dispatched for matrix objects.
------------------------------------------------------------
(where only the very last paragraph, of 1.5 lines, is new.)
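To illustrate the kind of code the NEWS entry warns about, here is a minimal sketch (shape_of() is a hypothetical helper name, not anything in R) of a test that does not depend on length(class(obj)) == 1 and should behave the same with and without the new class vector:

```r
# Hypothetical helper, not part of R: classify an object by its shape.
shape_of <- function(x) {
  # Fragile: `class(x) == "matrix"` has length 2 once class(x) is
  # c("matrix", "array"), so it is no longer a valid if() condition.
  # Robust alternatives: inherits(), or is.matrix() / is.array().
  if (inherits(x, "matrix")) "matrix"
  else if (is.array(x)) "array"
  else "other"
}

shape_of(diag(1))            # a 1 x 1 matrix
shape_of(array(1:24, 2:4))   # a 3-dimensional array
shape_of(1:3)                # a plain vector
```

inherits(x, "matrix") is TRUE for a matrix under both the old and the new behavior, which is why it is the safer test.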
Note the following
(if you use a version of R-devel, with svn rev >= 77446; which
you may get as a binary for Windows in about one day; everyone
else needs to compile from the sources .. or wait a bit, maybe
also not much longer than one day, for a docker image) :
> Sys.unsetenv("_R_CLASS_MATRIX_ARRAY_") # ==> current R behavior
> class(m <- diag(1))
[1] "matrix"
> Sys.setenv("_R_CLASS_MATRIX_ARRAY_" = "BOOH !") # ==> future R behavior
> class(m)
[1] "matrix" "array"
>
> foo <- function(x) UseMethod("foo")
> foo.array <- function(x) "made in foo.array()"
> foo(m)
[1] "made in foo.array()"
> Sys.unsetenv("_R_CLASS_MATRIX_ARRAY_")# ==> current R behavior
> foo(m)
Error in UseMethod("foo") :
no applicable method for 'foo' applied to an object of class "c('matrix', 'double', 'numeric')"
> Sys.setenv("_R_CLASS_MATRIX_ARRAY_" = TRUE) # ==> future R behavior
> foo(m)
[1] "made in foo.array()"
> foo.A <- foo.array ; rm(foo.array)
> foo(m)
Error in UseMethod("foo") :
no applicable method for 'foo' applied to an object of class "c('matrix', 'array', 'double', 'numeric')"
>
So, with my commit 77446, the _R_CLASS_MATRIX_ARRAY_
environment variable also changes the
"S3 dispatch determining class"
mentioned as 'class' in the error message (of the two cases, old
and new) above, which in R <= 3.6.x for a numeric matrix is
c('matrix', 'double', 'numeric')
and from R 4.0.0 on will be
c('matrix', 'array', 'double', 'numeric')
Note that this is *not* (in R <= 3.6.x, nor very probably in R 4.0.0)
the same as R's class().
Hadley calls this long class vector the 'implicit class' -- which
is a good term but somewhat conflicting with R's (i.e. R-core's)
"definition" used in the ?class help page (for ca. 11 years).
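A small sketch of that difference, which can be tried in any version of R and does not involve matrices at all (bar() and its methods are hypothetical names, used only for illustration):

```r
bar <- function(x) UseMethod("bar")        # hypothetical generic
bar.numeric <- function(x) "made in bar.numeric()"
bar.default <- function(x) "made in bar.default()"

class(1:2)                 # "integer": there is no "numeric" here
inherits(1:2, "numeric")   # FALSE, since inherits() looks at class()
bar(1:2)                   # dispatches to bar.numeric() nevertheless
bar("a")                   # no character method, so bar.default()
```

So S3 dispatch already uses a longer class vector than class() returns; the matrix/array change only extends that vector.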
R's internal C code has a nice function R_data_class2()
which computes this 'S3-dispatch-class' character (vector) for
any R object, and R_data_class2() is indeed called from (the
underlying C function of) R's UseMethod().
Using the above fact of an error message,
I wrote a nice (quite well tested) function my.class2() which
returns this S3_dispatch_class() also in current versions of R:
my.class2 <- function(x) { # use a fn name not used by any sane ..
    foo.7.3.343 <- function(x) UseMethod("foo.7.3.343")
    msg <- tryCatch(foo.7.3.343(x), error=function(e) e$message)
    clm <- sub('"$', '', sub(".* of class \"", '', msg))
    if(is.language(x) || is.function(x))
        clm
    else {
        cl <- str2lang(clm)
        if(is.symbol(cl)) as.character(cl) else eval(cl)
    }
}
## str2lang() needs R >= 3.6.0:
if(getRversion() < "3.6.0") ## substitute for str2lang(), good enough here:
    str2lang <- function(s) parse(text = s, keep.source=FALSE)[[1]]
Now you can look at such things yourself:
## --------------------- the "interesting" cases : ---
## integer and double
my.class2( pi) # == c("double", "numeric")
my.class2(1:2) # == c("integer", "numeric")
## matrix and array [also combined with int / double ] :
my.class2(matrix(1L, 2,3)) # == c(matrixCL, "integer", "numeric") <<<
my.class2(matrix(pi, 2,3)) # == c(matrixCL, "double", "numeric") <<<
my.class2(array("A", 2:3)) # == c(matrixCL, "character") <<<
my.class2(array(1:24, 2:4)) # == c("array", "integer", "numeric")
my.class2(array( pi , 2:4)) # == c("array", "double", "numeric")
my.class2(array(TRUE, 2:4)) # == c("array", "logical")
my.class2(array(letters, 2:4)) # == c("array", "character")
my.class2(array(1:24 + 1i, 2)) # == c("array", "complex")
## other cases
my.class2(NA) # == class(NA) : "logical"
my.class2("A") # == class("A"): "character"
my.class2(as.raw(0:2)) # == "raw"
my.class2(1 + 2i) # == "complex"
my.class2(USJudgeRatings)#== "data.frame"
my.class2(class) # == "function" # also for a primitive
my.class2(globalenv()) # == "environment"
my.class2(quote(sin(x)))# == "call"
my.class2(quote(sin) ) # == "name"
my.class2(quote({})) # == class(*) == "{"
my.class2(quote((.))) # == class(*) == "("
-----------------------------------------------------
Note that, of course, the lines marked "<<<" above contain
'matrixCL' which is "matrix" in "old" (i.e. current) R,
and is c("matrix", "array") in "new" (i.e. future) R.
Last but not least: It's quite trivial (only a few words need to
be added to the sources; more to the documentation) to add an R
function to base R which provides the same as my.class2() above,
(but much more efficiently, not via catching error messages !!),
and my current proposal for that function's name is .class2()
{it should start with a dot ("."), as it's not for the simple
minded average useR ... and you know how I'm happy with
function names that do not need one single [Shift] key ...}
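Code that wants the dispatch class could then prefer the built-in whenever it is present. A rough sketch, assuming the function does end up in base R under the proposed name .class2() (dispatch_class() is a hypothetical helper name; the error branch is where a fallback such as my.class2() could be plugged in):

```r
# Hypothetical wrapper, not part of R.
dispatch_class <- function(x) {
  if (exists(".class2", envir = baseenv(), inherits = FALSE)) {
    # Assumption: base R provides .class2() under the proposed name.
    get(".class2", envir = baseenv())(x)
  } else {
    stop("no built-in .class2() in this version of R")
  }
}

# Usage: returns NULL where the built-in is not available.
res <- tryCatch(dispatch_class(1:2), error = function(e) NULL)
```

Looking the function up in baseenv() rather than calling it directly keeps the code parseable and checkable in versions of R that do not have it.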
The current plan contains
1) Notify CRAN package maintainers (ca 140) whose packages no
longer pass R CMD check when the feature is turned on
(via setting the environment variable) in R-devel.
2a) (Some) CRAN team members set _R_CLASS_MATRIX_ARRAY_ (to non-empty),
as part of the incoming checks, at least for all new CRAN submissions
2b) set the _R_CLASS_MATRIX_ARRAY_ (to non-empty), as part of
' R CMD check --as-cran <pkg>'
3) Before the end of 2019, change the R sources (for R-devel)
such that it behaves as it behaves currently when the environment
variable is set *AND* abolish this environment variable from
the sources. {read on to learn *why*}
Consequently (to 3), R 4.0.0 will behave as indicated, unconditionally.
Note that (as I've shown above in the first example set) this is
set up in such a manner that you can change the environment
variable during a *running* R session, and observe the effect immediately.
This, however, leads to some slowdown of quite a bit of the R
code, because actually the environment variable has to be
checked quite often (easily dozens of times for simple R calls).
For that reason, we want to do "3)" as quickly as possible.
Please do not hesitate to ask or comment
-- here, not on Twitter, please -- noting that I'll be
basically offline for an extended weekend within 24h, now.
I hope this will eventually lead to cleanup and clarity in
R, and hence should be worth the pain of broken
back-compatibility and having to adapt your (almost always only
sub-optimally written ;-)) R code,
see also my Blog http://bit.ly/R_blog_class_think_2x
Martin Maechler
ETH Zurich and R Core team