[Rd] dict package: dictionary data structure for R
Duncan Temple Lang
duncan at wald.ucdavis.edu
Mon Jul 23 13:11:44 CEST 2007
Hi Seth.
Glad you did this. As you know, I think we need more specialized
data structures and the ability to introduce them easily
into R computations, both internally and at the R language level.
A few things come to mind after a quick initial look.
The HashFunc typedef in hashfuncs.h would be more flexible if it
took an additional argument of type void * to carry user-defined
data. Alternatively, it might take the hash table object itself.
The function might want to update the table, or consult some
table (e.g. for perfect hashing). And if we had a place to provide
additional information, it would be easy to allow the hash function
to be an R function.
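
A rough sketch of the void * idea follows. The actual HashFunc
signature in hashfuncs.h is not shown in this message, so the
argument list, the HashState and Dict structs, and the names
seeded_hash and dict_bucket are all assumptions made for illustration.

```c
#include <stddef.h>

/* Assumed signature: a key plus an opaque pointer to user-defined data.
   The real HashFunc in hashfuncs.h may look different. */
typedef unsigned int (*HashFunc)(const char *key, void *user_data);

/* Hypothetical user data: a seed and a counter the hash routine updates. */
typedef struct {
    unsigned int seed;
    size_t calls;
} HashState;

/* A simple multiplicative string hash, used only for illustration.
   It is not one of the package's hash functions. */
static unsigned int seeded_hash(const char *key, void *user_data)
{
    HashState *st = (HashState *) user_data;
    unsigned int h = st ? st->seed : 0u;

    if (st)
        st->calls++;                 /* the routine may update its own state */
    for (; *key; key++)
        h = h * 31u + (unsigned char) *key;
    return h;
}

/* Hypothetical table object that stores the routine and its data together. */
typedef struct {
    HashFunc hash;
    void *hash_data;
    size_t nbuckets;
} Dict;

/* Map a key to a bucket index by calling the stored routine with its data. */
static size_t dict_bucket(const Dict *d, const char *key)
{
    return d->hash(key, d->hash_data) % d->nbuckets;
}
```

The same slot could just as well hold a pointer to the table itself,
or to whatever is needed to call back into an R function.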
Also, you are using a "global" table of hash functions (i.e.
Dict_HashFunctions) and looking up the C routine with GET_HASHFUN,
which is tied to integer indexing into this global table.
Why not use the C routines directly from R, i.e. obtain the routine
with getNativeSymbolInfo and pass it from R to the newly created
dict? This avoids the lookup and the global table, makes things
extensible with routines in packages, and extends simply to allowing
R functions to be passed instead of C routines.
It also removes the need to synchronize the labeling system in
R and in C, i.e. that 0L corresponds to PJW. Relying on
synchronized names rather than direct handles is unnecessary,
although it is widely done in S/R code.
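
To make the contrast concrete, here is a plain C sketch of the two
styles. It uses a simplified one-argument signature, and every name in
it (HashFn, DictHandle, dict_init_by_index, dict_init, the two hash
routines) is hypothetical. The PJW routine is a textbook formulation
and not necessarily the package's code. Unwrapping the handle that
arrives from R is omitted.

```c
#include <stddef.h>

/* Simplified, assumed signature for this sketch. */
typedef unsigned int (*HashFn)(const char *key);

/* Textbook PJW-style string hash. */
static unsigned int pjw_hash(const char *key)
{
    unsigned int h = 0, g;

    for (; *key; key++) {
        h = (h << 4) + (unsigned char) *key;
        g = h & 0xF0000000u;
        if (g) {
            h ^= g >> 24;
            h &= ~g;
        }
    }
    return h;
}

/* A routine that could live in some other package (djb2-style). */
static unsigned int djb2_hash(const char *key)
{
    unsigned int h = 5381u;

    for (; *key; key++)
        h = h * 33u + (unsigned char) *key;
    return h;
}

typedef struct {
    HashFn hash;
    size_t nbuckets;
} DictHandle;

/* Index style: R and C must agree that 0 means PJW, and only routines
   registered in this table can ever be used. */
static HashFn known_hashes[] = { pjw_hash };
#define N_KNOWN (sizeof known_hashes / sizeof known_hashes[0])

static int dict_init_by_index(DictHandle *d, int which, size_t nbuckets)
{
    if (which < 0 || (size_t) which >= N_KNOWN || nbuckets == 0)
        return -1;
    d->hash = known_hashes[which];
    d->nbuckets = nbuckets;
    return 0;
}

/* Direct-handle style: the caller passes the routine itself, so there is
   no table to look up and no numbering to keep in sync. */
static int dict_init(DictHandle *d, HashFn hash, size_t nbuckets)
{
    if (hash == NULL || nbuckets == 0)
        return -1;
    d->hash = hash;
    d->nbuckets = nbuckets;
    return 0;
}
```

With the direct form, djb2_hash works without being registered
anywhere, whereas the index form rejects anything outside its table.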
I'm more than happy to give some code to illustrate what I mean
more precisely if you'd like it.
D.
Seth Falcon wrote:
> Hi all,
>
> The dict package provides a dictionary (hashtable) data
> structure much like R's built-in environment objects, but with the
> following differences:
>
> - The Dict class can be subclassed.
>
> - Four different hashing functions are implemented and the user can
> specify which to use when creating an instance.
>
> I'm sending this here as opposed to R-packages because this package
> will only be of interest to developers and because I'd like to get
> feedback from a slightly smaller community before either putting it on
> CRAN or retiring it to /dev/null.
>
> The design makes it fairly easy to add additional hashing functions,
> although currently this must be done in C. If nothing else, this
> package should be useful for evaluating hashing functions (see the
> vignette for some examples).
>
> Source:
> R-2.6.x: http://userprimary.net/software/dict_0.1.0.tar.gz
> R-2.5.x: http://userprimary.net/software/dict_0.0.4.tar.gz
>
> Windows binary:
> R-2.5.x: http://userprimary.net/software/dict_0.0.4.zip
>
>
> + seth
>