[Rd] read.table() code fails outside of the utils package

Andrew Piskorski atp at piskorski.com
Mon Apr 21 17:53:17 CEST 2014


One of the great things about R is how readable and re-usable much of
its own implementation is.  If an R function doesn't do quite what you
want but is close, it is usually very easy to read its code and start
adapting that as the base for a modified version.

In the 2.x versions of R, that was the case with read.table().  It was
easy to experiment with its source code, as it all worked just fine
when run at the top level or from inside any other package.

In R 3.1.0, that is no longer true.  The read.table() source ONLY works
when run from inside the "utils" package.  The (only) culprit is this:

  .External(C_readtablehead, file, 1L, comment.char, blank.lines.skip, quote, sep, skipNul)

Older versions of read.table() instead made the following call, which
ran fine from any package.  That entry point no longer exists:

  .Internal(readTableHead(file, nlines, comment.char, blank.lines.skip, quote, sep)) 
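
The failure is easy to reproduce at top level.  The following is a
minimal sketch, not taken from the original session: the temporary
file, its contents, and the argument values are made up for
illustration, and the exact error text may differ between R versions.

```r
# Hypothetical two-line input file, for illustration only.
tmp <- tempfile(fileext = ".txt")
writeLines(c("a b c", "1 2 3"), tmp)
con <- file(tmp, "rt")

# Same call shape as in read.table(), but evaluated outside the utils
# namespace, where the unexported object C_readtablehead is not visible.
msg <- tryCatch(
    .External(C_readtablehead, con, 1L, "#", TRUE, "\"'", "", FALSE),
    error = function(e) conditionMessage(e)
)

close(con)
unlink(tmp)
print(msg)
```

The error is caught with tryCatch() so that the connection and the
temporary file are always cleaned up.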

The C implementation of readtablehead is in utils.so, but the symbol
is marked as local.  I tried adding "attribute_visible" to its
function definition in "src/library/utils/src/io.c" and recompiling,
which DID make the symbol globally visible.  With that change, my own
C code works just fine when calling readtablehead.  But interestingly,
R code that passes the name to .External() as a string still fails:

   .External("readtablehead", ..., PACKAGE="utils") 
   Error: "readtablehead" not available for .External() for package "utils" 

Why is that?  Apparently making the C symbol visible isn't enough, so
what else is needed for .External() to find it?
(Clearly there's something about how R's C interface works that I
don't understand.)
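
One way to dig into this from the R side is to look at which routines
the utils shared library has registered with R, and whether a matching
C_-prefixed object exists in the utils namespace.  The sketch below
uses only introspection functions from base R.  That readtablehead
appears among the registered .External routines, and that utils keeps
an unexported object named C_readtablehead, are assumptions inferred
from the call in read.table(), not something confirmed here.  The
input file and argument values are again made up for illustration.

```r
# Routines that the utils shared library has registered with R.
routines <- getDLLRegisteredRoutines("utils")
ext_names <- names(routines[[".External"]])
registered <- "readtablehead" %in% ext_names

# Is there an unexported object with the C_ prefix in the utils namespace?
ns <- asNamespace("utils")
in_namespace <- exists("C_readtablehead", envir = ns, inherits = FALSE)
exported <- "C_readtablehead" %in% getNamespaceExports("utils")

print(c(registered = registered, in_namespace = in_namespace,
        exported = exported))

# If the object exists, try passing it (rather than a character string)
# to .External() from outside utils.  Any error is caught and printed.
if (in_namespace) {
    tmp <- tempfile(fileext = ".txt")
    writeLines(c("a b c", "1 2 3"), tmp)
    con <- file(tmp, "rt")
    first <- tryCatch(
        .External(get("C_readtablehead", envir = ns),
                  con, 1L, "#", TRUE, "\"'", "", FALSE),
        error = function(e) conditionMessage(e)
    )
    close(con)
    unlink(tmp)
    print(first)
}
```

If the last call succeeds where the string form with PACKAGE="utils"
fails, that would suggest .External() resolves names through R's own
registration data rather than through the visibility of the C symbol.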

Finally, since it is generally useful to be able to experiment with
and re-use parts of the stock read.table() implementation, I suggest:

1. Add "attribute_visible" to readtablehead, or otherwise make it
   callable from user C code.
2. Make readtablehead callable from user R code via .External().

What do you think?  Note that I'm not asking that the current
interface or behavior of readtablehead necessarily be SUPPORTED in any
way, just that it be callable for experimental purposes, much as the
old .Internal(readTableHead()) was in earlier versions of R.

-- 
Andrew Piskorski <atp at piskorski.com>


