[Rd] True length - length(unclass(x)) - without having to call unclass()?

Iñaki Ucar iucar at fedoraproject.org
Mon Sep 10 14:30:47 CEST 2018


On Mon, 10 Sept 2018 at 14:18, Tomas Kalibera
(<tomas.kalibera using gmail.com>) wrote:
>
> On 09/05/2018 11:18 AM, Iñaki Ucar wrote:
> > The bottom line here is that one can always call a base method,
> > inexpensively and without modifying the object, in, let's say,
> > *formal* OOP languages. In R, this is not possible in general. It
> > would be possible if there was always a foo.default, but primitives
> > use internal dispatch.
> >
> > I was wondering whether it would be possible to provide a super(x, n)
> > function which simply causes the dispatching system to avoid "n"
> > classes in the hierarchy, so that:
> >
> >> x <- structure(list(), class=c("foo", "bar"))
> >> length(super(x, 0)) # looks for a length.foo
> >> length(super(x, 1)) # looks for a length.bar
> >> length(super(x, 2)) # calls the default
> >> length(super(x, Inf)) # calls the default
> I think that a cast should always be to a specific class, defined by
> the name of the class. Identifying classes by their inheritance index
> might be unnecessarily brittle - it would break if someone introduced a
> new ancestor class.

Agree. But just wanted to point out that, then, something like
super(x, "default") should always work to point to default methods,
even if a method is internal and there's no foo.default defined.
Otherwise, we would have the same problem.
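
A rough sketch of the intended semantics in plain R follows. Note that
super() here is a hypothetical helper, not an existing base function, and
length.foo and length.bar are made-up methods for the example. The sketch
emulates the cast by rewriting the class attribute, so it still copies; it
only shows which method each call should reach, not a fast implementation.

```r
# Hypothetical helper, NOT part of base R: dispatch as if 'x' started at
# class 'cls' in its hierarchy; "default" skips all S3 methods.
super <- function(x, cls) {
  if (identical(cls, "default"))
    return(unclass(x))
  classes <- oldClass(x)
  i <- match(cls, classes)
  if (is.na(i))
    stop("'x' does not inherit from class \"", cls, "\"")
  # Keep 'cls' and its ancestors, drop everything in front of it.
  oldClass(x) <- classes[i:length(classes)]
  x
}

# Made-up methods so that each level of dispatch is distinguishable.
length.foo <- function(x) 100L
length.bar <- function(x) 200L

x <- structure(list(1, 2, 3), class = c("foo", "bar"))

n_foo     <- length(x)                    # dispatches to length.foo
n_bar     <- length(super(x, "bar"))      # dispatches to length.bar
n_default <- length(super(x, "default"))  # internal default, no foo.default needed
```

With these definitions n_foo is 100, n_bar is 200 and n_default is 3, the
length of the underlying list, even though no length.default exists.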

Iñaki

> Apart from the syntax - supporting fast casts for S3
> dispatch in the current implementation would be quite a bit of work,
> probably not worth it, also it would probably slow down the internal
> dispatch in primitives. But a partial solution could be implemented at
> some point with ALTREP wrappers, once one could create, without copying, a
> wrapper object with a modified class attribute.
>
> Tomas
> > Iñaki
> >
> > On Wed, 5 Sept 2018 at 10:09, Tomas Kalibera
> > (<tomas.kalibera using gmail.com>) wrote:
> >> On 08/24/2018 07:55 PM, Henrik Bengtsson wrote:
> >>> Is there a low-level function that returns the length of an object 'x'
> >>> - the length that for instance .subset(x) and .subset2(x) see? An
> >>> obvious candidate would be to use:
> >>>
> >>> .length <- function(x) length(unclass(x))
> >>>
> >>> However, I'm concerned that calling unclass(x) may trigger an
> >>> expensive copy internally in some cases.  Is that concern unfounded?
> >> unclass() will always copy when "x" is really a variable, because the
> >> value in "x" will be referenced; whether it is prohibitively expensive
> >> or not depends only on the workload - if "x" is a very long list and
> >> this function is called often then it could be, but at least to me this
> >> sounds unlikely. Unless you have a strong reason to believe it is the
> >> case I would just use length(unclass(x)).
> >>
> >> If the copying is really a problem, I would think about why the
> >> underlying vector length is needed at R level - whether you really need
> >> to know the length without actually having the unclassed vector anyway
> >> for something else, so whether you are not paying for the copy anyway.
> >> Or, from the other end, if you need to do more without copying, and it
> >> is possible without breaking the value semantics, then you might need to
> >> switch to C anyway and for a bigger piece of code.
> >>
> >> If it were still just .length() you needed and it were performance
> >> critical, you could just switch to C and call Rf_length. That does not
> >> violate the semantics; it is just not elegant, as you are
> >> switching to C.
> >>
> >> If you stick to R and can live with the overhead of length(unclass(x))
> >> then there is a chance the overhead will decrease as R is optimized
> >> internally. This is possible in principle when the runtime knows that
> >> the unclassed vector is only needed to compute something that does not
> >> modify the vector. The current R cannot optimize this out, but it should
> >> be possible with ALTREP at some point (and as Radford mentioned pqR does
> >> it differently). Even with such internal optimizations indeed it is
> >> often necessary to make guesses about realistic workloads, so if you
> >> have a realistic workload where say length(unclass(x)) is critical, you
> >> are more than welcome to donate it as benchmark.
> >>
> >> Obviously, if you use a C version calling Rf_length, after such R
> >> optimization your code would be unnecessarily non-elegant, but would
> >> still work and probably without overhead, because R can't do much less
> >> than Rf_length. In more complicated cases though hand-optimized C code
> >> to implement say 2 operations in sequence could be slower than what a
> >> better optimizing runtime could do by joining the effect of possibly
> >> more operations, which is in principle another danger of switching from
> >> R to C. But as long as the semantics are followed, there is no other danger.
> >>
> >> The temptation should be small anyway in this case when Rf_length()
> >> would be the simplest, but as I made it more than clear in the previous
> >> email, one should never violate the value semantics by temporarily
> >> modifying the object (temporarily removing the class attribute or
> >> temporarily removing the object bit). Violating semantics causes bugs, if
> >> not with the present then with future versions of R (where version may
> >> be an svn revision). A concrete recent example: modifying objects in
> >> place in violation of the semantics caused a lot of bugs with
> >> introduction of unification of constants in the byte-code compiler.
> >>
> >> Best
> >> Tomas
> >>
> >>> Thxs,
> >>>
> >>> Henrik
> >>>


