R-beta: S Compatibility (again)

Sun Apr 12 09:50:18 CEST 1998

While I think total compatibility is neither possible nor even very
important, I have recently become aware of some fairly low-level
incompatibilities that have rather serious consequences for
interchange between systems.  Some of these are possibly well
known, but I put them on record here just in case.  (I do not
expect any action on these, of course, but people may want to
note them, and correct me if I'm wrong.)

1. If you need to deal with non-printable characters in S you
   can use numeric escape sequences such as "\001" for "^A"
   (control-A) and when such characters occur in dumped objects
   they become these numeric escapes.

   R neither reads nor generates numeric escape sequences (with
   the possible exception of the weirdo "\300" above, which it
   cannot read, of course).

   When an object containing non-printable character components
   is dumped in R those characters are dumped literally, except
   for \007 to \015, which have alpha escape sequences:

R:

> ascii[8:14]
[1] "\a" "\b" "\t" "\n" "\v" "\f" "\r"

   In S only some of these alpha escape sequences are recognised
   or generated:

S:

> ascii[8:14]
[1] "\007" "\b" "\t" "\n" "\013" "\014" "\r"

   Thus "\a" reads as \007 (bel) in R but as "a" in S.  Also
   "\007" in a dump file reads as ^G (control-G) in S, but as
   "007" (a 3-character string) in R.

   I think S has the better convention, since its dump files are
   in printable ASCII and so not so susceptible to email
   transmission problems.  R dump files should be treated as
   binaries and alpha encoded.

   Curiously, S can almost read R dump files, but not conversely.
   The one place where S fails is when R generates an "\a", "\v"
   or "\f"; S can read literal control characters as well as
   numeric escape sequences.
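
   One possible workaround is to convert strings to the S-style
   numeric escapes before writing them out.  What follows is only
   a minimal sketch: octal_escape() and escape_byte() are
   hypothetical names, and it assumes an R that provides
   charToRaw(), rawToChar(), sprintf() and vapply(), which the
   version discussed here may not.  Quoting of the result is left
   to the caller.

```r
# Hypothetical helper: turn one byte value into printable ASCII text.
# Assumes sprintf() and rawToChar() are available in the R in use.
escape_byte <- function(b) {
  if (b == 92L) {
    "\\\\"                      # double a literal backslash
  } else if (b < 32L || b > 126L) {
    sprintf("\\%03o", b)        # S-style three-digit octal escape
  } else {
    rawToChar(as.raw(b))        # printable ASCII passes through unchanged
  }
}

# Hypothetical helper: escape every element of a character vector.
octal_escape <- function(x) {
  vapply(x, function(s) {
    bytes <- as.integer(charToRaw(s))
    paste(vapply(bytes, escape_byte, ""), collapse = "")
  }, "", USE.NAMES = FALSE)
}

# Bell (\a) becomes \007 and tab becomes \011 in the written text.
writeLines(octal_escape(c("\a", "tab\there")))
```

   The output contains only printable characters, so it is the
   sort of text that S generates in its own dump files.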

2. substring(...., first = 0, ...) does not give the same result
   in the two systems.

R:                                 S:

> n                                > n
[1] ""  "a" "b" "c"                [1] ""  "a" "b" "c"
> nchar(n)                         > nchar(n)
[1] 0 1 1 1                        [1] 0 1 1 1
> substring(n, 0, nchar(n))        > substring(n, 0, nchar(n))
[1] "@" ""  ""  ""                 [1] ""  "a" "b" "c"
> substring(n, 1, nchar(n))        > substring(n, 1, nchar(n))
[1] ""  "a" "b" "c"                [1] ""  "a" "b" "c"
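
   Code that has to run on both systems can sidestep the
   difference by never passing a start position below 1.  A
   minimal sketch, in which safe_substring() is a hypothetical
   name and the default for last is an arbitrary large number
   chosen for illustration:

```r
# Hypothetical wrapper: clamp the start position to 1 so the result
# does not depend on how first = 0 is treated.
safe_substring <- function(text, first, last = 1000000L) {
  substring(text, pmax(first, 1L), last)
}

n <- c("", "a", "b", "c")
safe_substring(n, 0, nchar(n))
```

   Since first = 1 gives the same answer in both transcripts
   above, the wrapper should behave identically in R and in S.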

3. Language manipulation within R seems to be impossible.  I
   realise this may be a design limitation that nobody can do
   anything about, but it may be worth noting.

   To be more specific, the R substitute() is much more limited
   than the S version, coercion to mode "{", "call" or
   "function" is unavailable, and function objects are not
   subsettable.  as.call is useless and as.function only exists
   to issue a rather tetchy error message.

   For example, as far as I can see it is impossible to write a
   version of the S function deriv() for R, short of getting down
   to brass tacks and writing a new primitive into the base code,
   but even then you can't extend it, of course.
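
   For concreteness, the kind of manipulation meant is sketched
   below.  This is an illustration under assumptions, not
   something the R discussed here will do: it presumes an
   interpreter in which as.call() and substitute() accept these
   arguments in the S manner.  The names cl and e are arbitrary.

```r
# Build the call sin(x) from its parts: a function name and an argument.
cl <- as.call(list(as.name("sin"), as.name("x")))

# Splice that call, and a second quoted call, into a template expression.
e <- substitute(a * b, list(a = cl, b = quote(cos(x))))

# The result is itself an expression that can be inspected or evaluated.
deparse(e)
eval(e, list(x = 0))
```

   A deriv() written at the user level needs exactly this ability
   to take expressions apart and put new ones together.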

4. I miss find() and data.dump() in R very badly, and the
   curiously different argument sequence for objects() is
   disconcerting to someone who has to use both systems.

Bill

-- 
Bill Venables, Head, Dept of Statistics,    Tel.: +61 8 8303 5418
University of Adelaide,                     Fax.: +61 8 8303 3696
South AUSTRALIA.     5005.   Email: Bill.Venables at adelaide.edu.au
