[Rd] Re: ichar() function in R : 1st implementation, RFC

Martin Maechler maechler at stat.math.ethz.ch
Thu Oct 23 14:36:13 MEST 2003


(RFC := Request For Comments)

>>>>> "Tim" == Tim Keighley <Tim.Keighley at csiro.au>
>>>>>     on Thu, 23 Oct 2003 11:45:22 +1000 writes:

    Tim> Hi Martin,
    Tim> In October 2000 you wrote to r-help:

    >>> which reminds me that I've had a desire for something like
    >>> the old S function [from the blue book, and library(examples) I think]
    >>> ichar(ch)
    >>> which would return a vector of integers, each the (decimal) equivalent of
    >>> the (ISO-latin1) representation of the corresponding characters in ch.
    >>> 
    >>> This should be easy enough (and be done in C).
    >>> Any volunteers?

    Tim> Did you get any volunteers?
no.
Thank you for reminding me!

    Tim> Is this function or an
    Tim> equivalent now available in R? I have been unsuccessful
    Tim> in my investigation but I do not have all the available
    Tim> CRAN packages, so it might be in one of them.

I've searched myself (we have almost all of them installed), and
didn't find anything.

    Tim> Cheers,
    Tim> Tim Keighley

I now did a first cut, using R only code,
(and realizing that most of  chars8bit() should really happen in C).

I'm proposing to add something like this to R-devel in the near
future.
Note that  AsciiToInt() and ichar() are for S-plus and "old S"
compatibility, whereas I think we'd really want (equivalents) of
the three functions
    digitsBase()
    chars8bit()
    strcodes()
in R eventually.  

I'm very interested in feedback, 
particularly
- function and arguments' naming
- proposals for improvements
- neat examples of usage.

Martin

Martin Maechler <maechler at stat.math.ethz.ch>	http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum  LEO C16	Leonhardstr. 27
ETH (Federal Inst. Technology)	8092 Zurich	SWITZERLAND
phone: x-41-1-632-3408		fax: ...-1228			<><

-------------- next part --------------

### This was  digits.v()  in library(sfsmisc):
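### --> get its help() file /u/maechler/R/Pkgs/sfsmisc/man/digits.Rd
### The default `ndigits' uses max(x, 1) so that it stays finite when all x are 0;
### the 1e-9 guards against log() landing just below an integer.
digitsBase <- function(x, base = 2, ndigits = 1 + floor(1e-9 + log(max(x, 1), base)))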
{
    ## Purpose: Give the vector A of the base-_base_ representation of _n_:
    ## -------  n = sum_{k=0}^M  A_{M-k} base ^ k ,   where  M = length(A) - 1
    ## Value: MATRIX  M where  M[,i]  corresponds to  x[i]
    ##	c( result ) then contains the blocks in proper order ..
    ## Author: Martin Maechler, Date:  Wed Dec  4 14:10:27 1991
    ## ----------------------------------------------------------------
    ## Arguments: x: vector of non-negative integers
    ##	base: Base for representation
    ##	ndigits: Number of digits/bits to use
    ## EXAMPLE: digitsBase(1:24, 8) #-- octal representation
    ## ----------------------------------------------------------------
    if(any((x <- as.integer(x)) < 0))
        stop("`x' must be non-negative integers")
    r <- matrix(0, nrow = ndigits, ncol = length(x))
    if(ndigits >= 1) for (i in ndigits:1) {
        r[i,] <- x %% base
        if (i > 1) x <- x %/% base
    }
    r
}
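
## Sketch (added illustration, not from the original post): a hand check of the
## formula in the "Purpose" comment above,  n = sum_k A_{M-k} base^k,
## for n = 83 in base 8.  The names A, M and n are only for this example.
A <- c(1, 2, 3)              # octal digits of 83, most significant first
M <- length(A) - 1
n <- sum(A * 8^(M:0))        # 1*64 + 2*8 + 3*1
stopifnot(n == 83)
## If digitsBase() from above is loaded, it should return the same digits:
if (exists("digitsBase")) stopifnot(all(digitsBase(83, base = 8) == A))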


### This is an improved version of  make.ASCII() in 1991's ~/S/Good-string.S !
chars8bit <- function(i = 0:255)
{
    ## Purpose: Compute a character vector from its "ASCII" codes.
    ## We seem to have to use this complicated way thru text and parse.

    ## Author: Martin Maechler, Original date: Wed Dec 4, 1991
    ## ----------------------------------------------------------------
    i <- as.integer(i)
    if(any(i < 0 | i > 255)) stop("`i' must be in 0:255")
    i8 <- apply(digitsBase(i, base = 8), 2, paste, collapse="")
    c8 <- paste('"\\', i8, '"', sep="")
    eval(parse(text = paste("c(",paste(c8, collapse=","),")", sep="")))
}
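
## Sketch (added illustration, not from the original post): the "text and parse"
## route for one single code, 65.  It assumes a current R where as.octmode()
## is available; `oct' and `ch' are names made up for this example.
oct <- format(as.octmode(65L))                               # octal digits: "101"
ch  <- eval(parse(text = paste('"\\', oct, '"', sep = "")))  # parses the text "\101"
stopifnot(identical(oct, "101"), identical(ch, "A"))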

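strcodes <- function(x, table = chars8bit(1:255))
{
    ## Purpose: R (code) implementation of old S's ichar()
    ## Note: `table' starts at code 1, so that the positions returned by
    ##	match() are the character codes themselves (with 0:255 they are off by one).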
    ## ----------------------------------------------------------------------
    ## Arguments: x: character vector
    ## ----------------------------------------------------------------------
    ## Author: Martin Maechler, Date: 23 Oct 2003, 12:42

    lapply(strsplit(x, ""), match, table = table)
}

## S-PLUS has  AsciiToInt() officially, and   ichar() in  library(examples):
AsciiToInt <- ichar <- function(strings) unname(unlist(strcodes(strings)))

## Examples:

all8bit <- chars8bit(0:255)
matrix(all8bit, 32, 8, byrow = TRUE)

x <- c(a = "abc", bb = "BLA & blu", Person = "Mächler, Zürich")
strcodes(x)
AsciiToInt(x)
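
## Cross-check sketch (added illustration, not part of the original proposal):
## for plain ASCII input, the codes can be compared against base R's
## charToRaw(), assuming an R version that provides charToRaw() and
## rawToChar(), as current releases do.  `s', `codes' and `back' are
## names made up for this example.
s <- "abc"
codes <- as.integer(charToRaw(s))    # one integer code per byte
back  <- rawToChar(as.raw(codes))    # and back to the string
stopifnot(identical(codes, c(97L, 98L, 99L)), identical(back, s))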

