[Rd] Future plans for raw data type?
Prof Brian Ripley
ripley at stats.ox.ac.uk
Wed Sep 28 11:59:39 CEST 2005
On Tue, 27 Sep 2005 dhinds at sonic.net wrote:
> I've been working with raw vectors quite a bit and was wondering if
> the R team might comment on where they see raw vector support going in
> the long run. Is the intent that 'raw' will eventually become a first
> class data type on the same level as 'integer'? Or should 'raw' have
> more limited support, by design?
They _are_ `first class data types', atomic vectors, just like integers.
The intent remains that their contents should not be interpreted, just as
in the Green Book. One consequential difference from other atomic vectors
is that there is no notion of NA for raw elements.
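
A minimal sketch of that difference, assuming a reasonably current R
release (the variable names are illustrative only). It compares an integer
vector, which can hold NA, with a raw vector, which cannot:

```r
# Raw vectors are atomic, just like integer vectors.
r <- as.raw(c(0, 1, 255))
i <- c(0L, NA, 255L)

raw_is_atomic <- is.atomic(r)
raw_type <- typeof(r)

# Integers have a missing-value marker; raw bytes do not, so every
# byte value is an ordinary value and is.na() is FALSE throughout.
int_has_na <- anyNA(i)
raw_has_na <- anyNA(r)
raw_is_na <- is.na(r)
```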
This means that there are basically no plans to add support for
manipulation of raw vectors. We have already gone quite a lot further
than S does, and quite a few things have been considered undesirable (see
below).
> For example, with very minor changes to subassign.c to implement some
> automatic coercions, raw vectors can become arguments to ifelse() and
> can be members of data frames. Would this be desirable?
It is desirable that they can be members of data frames, which is why they
_can_ be:
> y <- charToRaw("test")
> z <- data.frame(y)
format() was not handling raw until recently, but now does. Thus z can
now be printed. (Again, it is somewhat dubious that one should be able to
format/print raw vectors as that imposes an interpretation, but it is
convenient.)
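
A slightly fuller version of the two lines above, as a sketch assuming an
R release in which format() handles raw. The names col_type, n_rows and
shown are illustrative only:

```r
y <- charToRaw("test")   # four bytes: 74 65 73 74
z <- data.frame(y)

# The column keeps its raw type inside the data frame.
col_type <- typeof(z$y)
n_rows <- nrow(z)

# format() renders each byte as text, which is what lets z be printed.
shown <- format(y)
print(z)
```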
ifelse() is coded in a peculiar way that needs logical to be coercible
(for some values of 'test') to a common mode for 'yes' and 'no'.
Alternatives are given on its help page.
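
One alternative of the kind suggested on the help page is to start from
'yes' and overwrite the positions where 'test' is false. The sketch below
applies that idea to raw vectors; it assumes 'test' contains no NA and
that all three vectors have the same length, and the data are made up for
illustration:

```r
test <- c(TRUE, FALSE, TRUE, FALSE)
yes <- as.raw(1:4)
no <- as.raw(11:14)

# Start from 'yes' and replace the elements where 'test' is FALSE.
# Only raw-to-raw assignment happens, so nothing is coerced to raw.
out <- yes
out[!test] <- no[!test]
```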
Given that you cannot interpret raw elements, you cannot unambiguously
coerce logical to raw. In particular there is no way to coerce logical NA
to raw. So what should ifelse(NA, yes, no) be? There is no good answer,
which is why the status quo is desirable. (as.raw warns if you attempt
this.)
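
The ambiguity can be seen directly. This is a sketch assuming a current R
release, in which as.raw() maps NA to the zero byte with a warning; the
handler only records that the warning occurred:

```r
warned <- FALSE

# Capture the warning that as.raw() raises for NA, then suppress it.
na_as_raw <- withCallingHandlers(
  as.raw(NA),
  warning = function(w) {
    warned <<- TRUE
    invokeRestart("muffleWarning")
  }
)

# The result is indistinguishable from coercing FALSE, so the
# missingness of the original value is lost.
collides_with_false <- identical(na_as_raw, as.raw(FALSE))
```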
You are vague as to which `automatic coercions' you think could be added,
but at least this one was deliberately not added.
Digging around I did find one unanticipated problem. If z is a list, z$a
<- raw_vector works but z[["a"]] <- raw_vector does not. The reason is
that for atomic vectors the latter first coerces the rhs to a list and
then extracts the first element. That is clearly wasteful (and not
documented), and I will take a closer look at it for 2.3.0, but I've added
sticking plaster for 2.2.0.
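
A quick check of the two assignment forms, as a sketch that assumes an R
release containing the fix described above (on such a release both forms
are expected to succeed). The element names "a" and "b" are illustrative:

```r
raw_vector <- charToRaw("ab")
z <- list()

z$a <- raw_vector        # the form reported as working
z[["b"]] <- raw_vector   # the form reported as failing before the fix

# Both elements should hold the same raw vector.
same <- identical(z$a, z[["b"]])
```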
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595