R-beta: S Compatibility (again)
Bill Venables
wvenable at attunga.stats.adelaide.edu.au
Sun Apr 12 09:50:18 CEST 1998
While I think total compatibility is not possible, or even very
important, I have recently become aware of some fairly low-level
incompatibilities that have rather serious consequences for
interchange between systems. Some of these are possibly well
known, but I put them on record here just in case. (I do not
expect any action on these, of course, but people may want to
note them, and correct me if I'm wrong.)
1. If you need to deal with non-printable characters in S you
can use numeric escape sequences such as "\001" for "^A"
(control-A) and when such characters occur in dumped objects
they become these numeric escapes.
R neither reads nor generates numeric escape sequences (with
the possible exception of the weirdo "\300" above, which it
cannot read, of course).
When an object containing non-printable character components
is dumped in R, those characters are dumped literally, except
for \007 to \015, which have alpha escape sequences:
R:
> ascii[8:14]
[1] "\a" "\b" "\t" "\n" "\v" "\f" "\r"
In S only some of these alpha escape sequences are recognised
or generated:
S:
> ascii[8:14]
[1] "\007" "\b" "\t" "\n" "\013" "\014" "\r"
Thus "\a" reads as \007 (bel) in R but as "a" in S. Also
"\007" in a dump file reads as ^G (control-G) in S, but as
"007" (a 3-character string) in R.
I think S has the better convention, since its dump files are in
printable ASCII and so not so susceptible to email
transmission problems. R dump files should be treated as
binaries and alpha encoded.
Curiously, S can almost read R dump files, but not conversely.
The place where S fails is when R generates an "\a", "\v" or
"\f"; literal control characters are readable by S as well as
numeric escape sequences.
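As an illustration of the S-style convention described above, here is a minimal sketch of a helper that rewrites control characters as three-digit octal escapes, so that the text is printable ASCII. The name to_numeric_escapes is hypothetical, and the code assumes a current R release (utf8ToInt() and intToUtf8() are used); it converts every control character to numeric form, including the ones S would write as "\b", "\t", "\n" and "\r".

```r
# Hypothetical helper (not part of R or S): rewrite control characters
# as three-digit octal escapes such as \007.
to_numeric_escapes <- function(x) {
  vapply(x, function(s) {
    codes <- utf8ToInt(s)                        # integer code of each character
    chars <- intToUtf8(codes, multiple = TRUE)   # the characters themselves
    is_ctrl <- codes < 32 | codes == 127         # ASCII control range plus DEL
    out <- ifelse(is_ctrl, sprintf("\\%03o", codes), chars)
    paste(out, collapse = "")
  }, character(1), USE.NAMES = FALSE)
}

# cat() shows the raw characters, so the escapes appear as plain text.
cat(to_numeric_escapes("bell:\a tab:\t"), "\n")
```

Applied to "\v" and "\f", the helper yields the text \013 and \014, which are the forms shown in the S output above.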
2. substring(..., first = 0, ...) does not behave the same way.
R:                            S:
> n                           > n
[1] "" "a" "b" "c"            [1] "" "a" "b" "c"
> nchar(n)                    > nchar(n)
[1] 0 1 1 1                   [1] 0 1 1 1
> substring(n, 0, nchar(n))   > substring(n, 0, nchar(n))
[1] "@" "" "" ""              [1] "" "a" "b" "c"
> substring(n, 1, nchar(n))   > substring(n, 1, nchar(n))
[1] "" "a" "b" "c"            [1] "" "a" "b" "c"
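Since the two systems agree when first = 1, one way to keep code portable is never to pass a start position below 1. The following is only a sketch; the wrapper name substring1 is hypothetical and not part of either system.

```r
# Hypothetical wrapper: clamp `first` to at least 1 so the result never
# depends on how a given system treats first = 0.
substring1 <- function(text, first, last = nchar(text)) {
  substring(text, pmax(first, 1), last)
}

n <- c("", "a", "b", "c")
substring1(n, 0, nchar(n))   # same call as substring(n, 1, nchar(n))
```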
3. Language manipulation within R seems to be impossible. I
realise this may be a design limitation that nobody can do
anything about, but it may be worth noting.
To be more specific, the R substitute() is much more limited
than the S version, coercion to mode "{", "call" or
"function" is unavailable, and function objects are not
subsettable. as.call is useless and as.function only exists to
issue a rather tetchy error message.
For example, as far as I can see it is impossible to write a
version of the S function deriv() for R, short of getting down
to brass tacks and writing a new primitive into the base code,
but even then you can't extend it, of course.
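For readers unsure what kind of language manipulation is meant, the sketch below shows the operations a deriv()-style function relies on: taking a call apart, editing a component, building a new call, and substituting into it. It assumes an interpreter in which calls are subsettable and as.call() works, as in current R releases; it says nothing about the R version discussed above, and the variable names are illustrative only.

```r
# Take an unevaluated call apart and rebuild it.
expr <- quote(sin(x)^2)
inner <- expr[[2]]               # the sub-call sin(x)
inner[[1]] <- as.name("cos")     # replace the function name
expr[[2]] <- inner               # expr is now cos(x)^2

# Build a new call from pieces: cos(x)^2 + 1
built <- as.call(list(as.name("+"), expr, 1))

# Substitute the symbol t for x inside the constructed call.
swapped <- do.call("substitute", list(built, list(x = as.name("t"))))
print(swapped)
```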
4. I miss find() and data.dump() in R very badly, and the
curiously different argument sequence for objects() is
disconcerting to someone who has to use both systems.
Bill
--
Bill Venables, Head, Dept of Statistics, Tel.: +61 8 8303 5418
University of Adelaide, Fax.: +61 8 8303 3696
South AUSTRALIA. 5005. Email: Bill.Venables at adelaide.edu.au
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._