[Rd] Re: ichar() function in R : 1st implementation, RFC
Martin Maechler
maechler at stat.math.ethz.ch
Thu Oct 23 14:36:13 MEST 2003
(RFC := Request For Comments)
>>>>> "Tim" == Tim Keighley <Tim.Keighley at csiro.au>
>>>>> on Thu, 23 Oct 2003 11:45:22 +1000 writes:
Tim> Hi Martin,
Tim> In October 2000 you wrote to r-help:
>>> which reminds me that I've had a desire for something like
>>> the old S function [from the blue book, and library(examples) I think]
>>> ichar(ch)
>>> which would return a vector of integers, each the (decimal) equivalent of
>>> the (ISO-latin1) representation of the corresponding characters in ch.
>>>
>>> This should be easy enough (and be done in C).
>>> Any volunteers?
Tim> Did you get any volunteers?
no.
Thank you for reminding me!
Tim> Is this function or an
Tim> equivalent now available in R? I have been unsuccessful
Tim> in my investigation but I do not have all the available
Tim> CRAN packages, so it might be in one of them.
I've searched myself (we have almost all of them installed), and
didn't find anything.
Tim> Cheers,
Tim> Tim Keighley
I now did a first cut, using R only code,
(and realizing that most of chars8bit() should really happen in C).
I'm proposing to add something like this to R-devel in the near
future.
Note that AsciiToInt() and ichar() ar for S-plus and "old S"
compatibility, whereas I think we'd really want (equivalents) of
the three functions
digitsBase()
chars8bit()
strcodes()
in R eventually.
I'm very interested in feedback,
particularly
- function and arguments' naming
- proposals for improvements
- neat examples of usage.
Martin
Martin Maechler <maechler at stat.math.ethz.ch> http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum LEO C16 Leonhardstr. 27
ETH (Federal Inst. Technology) 8092 Zurich SWITZERLAND
phone: x-41-1-632-3408 fax: ...-1228 <><
-------------- next part --------------
### This was digits.v() in library(sfsmisc):
### --> get it's help() file /u/maechler/R/Pkgs/sfsmisc/man/digits.Rd
digitsBase <- function(x, base = 2, ndigits = 1 + floor(log(max(x),base)))
{
## Purpose: Give the vector A of the base-_base_ representation of _n_:
## ------- n = sum_{k=0}^M A_{M-k} base ^ k , where M = length(a) - 1
## Value: MATRIX M where M[,i] corresponds to x[i]
## c( result ) then contains the blocks in proper order ..
## Author: Martin Maechler, Date: Wed Dec 4 14:10:27 1991
## ----------------------------------------------------------------
## Arguments: x: vector of non-negative integers
## base: Base for representation
## ndigits: Number of digits/bits to use
## EXAMPLE: digitsBase(1:24, 8) #-- octal representation
## ----------------------------------------------------------------
if(any((x <- as.integer(x)) < 0))
stop("`x' must be non-negative integers")
r <- matrix(0, nrow = ndigits, ncol = length(x))
if(ndigits >= 1) for (i in ndigits:1) {
r[i,] <- x %% base
if (i > 1) x <- x %/% base
}
r
}
### This is an improved version of make.ASCII() in 1991's ~/S/Good-string.S !
chars8bit <- function(i = 0:255)
{
## Purpose: Compute a character vector from its "ASCII" codes.
## We seem to have to use this complicated way thru text and parse.
## Author: Martin Maechler, Original date: Wed Dec 4, 1991
## ----------------------------------------------------------------
i <- as.integer(i)
if(any(i < 0 | i > 255)) stop("`i' must be in 0:255")
i8 <- apply(digitsBase(i, base = 8), 2, paste, collapse="")
c8 <- paste('"\\', i8, '"', sep="")
eval(parse(text = paste("c(",paste(c8, collapse=","),")", sep="")))
}
strcodes <- function(x, table = chars8bit(0:255))
{
## Purpose: R (code) implementation of old S's ichar()
## ----------------------------------------------------------------------
## Arguments: x: character vector
## ----------------------------------------------------------------------
## Author: Martin Maechler, Date: 23 Oct 2003, 12:42
lapply(strsplit(x, ""), match, table = table)
}
## S-PLUS has AsciiToInt() officially, and ichar() in library(examples):
AsciiToInt <- ichar <- function(strings) unname(unlist(strcodes(strings)))
## Examples:
all8bit <- chars8bit(0:255)
matrix(all8bit, 32, 8, byrow = TRUE)
x <- c(a = "abc", bb = "BLA & blu", Person = "M?chler, Z?rich")
strcodes(x)
AsciiToInt(x)
More information about the R-devel
mailing list