[Rd] RFC: API design of package "modules"

Mon Apr 28 15:55:26 CEST 2014

Some time ago I published the first draft of the package “modules”
[1], which aims to provide a module system as an alternative to
packages for R. Very briefly, it is meant to complement the existing
package system for very small code units which do not warrant the
(small, but existing) overhead associated with writing a package. I’ve
noticed that people around me put off writing packages (and thus,
reusable code) because of that overhead, and use `source` instead.
Modules would work (in many cases) as a drop-in replacement for
`source`, and could thus encourage code reuse.

However, now I’m stuck on a particular aspect of the API and would
like to solicit feedback from r-devel.

`import('foo')` imports a given module, `foo`. In addition to other
differences detailed in [2], modules allow/impose a hierarchical
organisation. That way, `import('foo')` might load code from a file
called `foo.r` or from a file called `foo/__init__.r` (reminiscent of
Python’s module mechanism) and `import('foo/bar')` would load a file
`foo/bar.r` or `foo/bar/__init__.r` [3].
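
A minimal sketch of that lookup order, for illustration only:
`module_candidates` and `find_module` are hypothetical helpers, not
functions of the package, and the search root and the lowercase `.r`
extension are assumptions.

```r
# Hypothetical helper: list the files that import(name) might load,
# in the order described above. Not part of the modules package.
module_candidates = function (name, ext = 'r') {
    c(
        paste0(name, '.', ext),
        file.path(name, paste0('__init__.', ext))
    )
}

# Return the first candidate that exists under `root`, or NA if none does.
find_module = function (name, root = '.') {
    candidates = file.path(root, module_candidates(name))
    hits = candidates[file.exists(candidates)]
    if (length(hits) == 0L) NA_character_ else hits[[1L]]
}

module_candidates('foo/bar')
# Yields 'foo/bar.r' and 'foo/bar/__init__.r'
```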

`import` also allows selectively importing only some functions, so
that a user might write `import('foo', c('f', 'g'))` to only import
the functions `f` and `g`.
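
One way such a selective import could behave, sketched with plain
environments. `import_sketch` is a hypothetical stand-in that takes a
file path rather than a module name; the actual implementation in the
package may differ.

```r
# Hypothetical stand-in for import(): evaluate a file in its own
# environment and hand back only the requested names.
import_sketch = function (file, names = NULL) {
    module_env = new.env(parent = globalenv())
    sys.source(file, envir = module_env)
    if (is.null(names)) return(module_env)

    absent = setdiff(names, ls(module_env))
    if (length(absent) > 0L)
        stop('not defined in module: ', paste(absent, collapse = ', '))

    # Copy the selected objects into a fresh container environment.
    list2env(mget(names, envir = module_env), envir = new.env(parent = emptyenv()))
}

# Throwaway module file with three functions, of which two are imported.
path = tempfile(fileext = '.r')
writeLines(c(
    'f = function () "f"',
    'g = function () h()',
    'h = function () "h"'
), path)

foo = import_sketch(path, c('f', 'g'))
ls(foo)
foo$g()
```

In this sketch `foo$g()` still finds `h`, because `g` keeps the
environment it was defined in, even though `h` itself is not exposed.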

However, at the moment modules don’t allow the equivalent of Python’s
`from foo import bar` for nested modules. That is, if `foo` contains
two nested modules `bar` and `baz`, I cannot import both of them in one
`import` statement; I need two (`import('foo/bar');
import('foo/baz')`).

I would like feedback on what people think is the best way of solving
this. Here are some suggestions I’ve gathered; in the following,
`foo`, `bar`, `qux` are (sub)modules; `f1`, `b1`, `b2`, `q1` … are
functions defined in the module whose name starts with the same letter:

(1) Use of Bash-like wildcards to specify which modules to import:

```
foo = import('foo')
# Exposes `foo$f1`, `foo$f2` …, but no submodules

bar = import('foo/bar')
# Exposes `bar$b1`, `bar$b2`

foo = import('foo/{bar,qux}')
# Exposes `foo$f1`, `foo$bar$b1`, `foo$bar$b2`, `foo$qux$q1` etc.

foo = import('foo/*')
# Exposes everything

# Specifying which functions to import:
foo = import('foo/{bar,qux}', c('bar$b1', 'qux$q1'))
# Exposes `foo$bar$b1`, `foo$qux$q1` but NOT `foo$f1`, `foo$bar$b2` etc.
```

This is straightforward, but I feel vaguely that it’s too stringly
typed [4]. A colleague dislikes this proposal because it treats nested
modules and functions unequally: as mentioned above, `import('foo',
'f')` will import only `f` from `foo`. His argument is that there
should be a uniform way of specifying which nested modules or
functions to import – somewhat analogously to Python’s mechanism,
where `from a import b` might import a submodule *or* an object `b`.
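
For reference, the brace syntax in (1) would need an expansion step
roughly like the following. `expand_braces` is a hypothetical helper;
it handles a single brace group only and ignores the `*` wildcard.

```r
# Hypothetical helper: expand one Bash-style brace group, e.g.
# 'foo/{bar,qux}' becomes c('foo/bar', 'foo/qux'). Nested or repeated
# groups and the `*` wildcard are deliberately not handled.
expand_braces = function (spec) {
    pattern = '^(.*)\\{([^{}]*)\\}(.*)$'
    if (! grepl(pattern, spec, perl = TRUE)) return(spec)

    prefix = sub(pattern, '\\1', spec, perl = TRUE)
    group = sub(pattern, '\\2', spec, perl = TRUE)
    suffix = sub(pattern, '\\3', spec, perl = TRUE)

    alternatives = trimws(strsplit(group, ',', fixed = TRUE)[[1L]])
    paste0(prefix, alternatives, suffix)
}

expand_braces('foo/{bar,qux}')
# Yields 'foo/bar' and 'foo/qux'
```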

(2) Treat submodules and functions uniformly, one per argument:

```
foo = import('foo')
# Exposes `foo$f1`, `foo$f2` …, but no submodules

bar = import('foo/bar')
# Exposes `bar$b1`, `bar$b2`

foo = import('foo/f1', 'foo/bar', 'foo/qux/q1')
# Exposes `foo$f1`, `foo$bar$b1`, `foo$bar$b2`, `foo$qux$q1`.
```

However, this has the disadvantage of cramming even more functionality
into the first argument and making everything stringly typed
instead of using “proper” function arguments.
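
To make (2) concrete, here is a rough sketch of how the arguments might
be grouped before resolution. `split_specs` is a hypothetical helper;
whether a trailing component such as `bar` names a submodule or a
function cannot be read off the string and would have to be decided
while loading `foo`.

```r
# Hypothetical helper: group path-like arguments by their top-level
# module and keep the remainder of each path for later resolution.
split_specs = function (...) {
    specs = c(...)
    parts = strsplit(specs, '/', fixed = TRUE)
    top = vapply(parts, function (p) p[[1L]], character(1L))
    rest = vapply(parts, function (p) paste(p[-1L], collapse = '/'), character(1L))
    split(rest, top)
}

grouped = split_specs('foo/f1', 'foo/bar', 'foo/qux/q1')
grouped$foo
# Yields 'f1', 'bar' and 'qux/q1'
```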

(3) Drop the whole thing and force people to use a separate `import`
statement for every submodule (.NET does this for namespace imports,
but then, .NET’s namespaces don’t implement a module system):

```
foo = import('foo')
# Exposes `foo$f1`, `foo$f2` …, but no submodules

bar = import('foo/bar')
# Exposes `bar$b1`, `bar$b2`

foo = import('foo', 'f1')
# Exposes `foo$f1`

qux = import('foo/qux')
# Exposes `qux$q1` …
```

(4) Something else?

So this is my question: what do other people think? Which is the most
useful and least confusing alternative from the users’ perspective?

[1]: https://github.com/klmr/modules
[2]: https://github.com/klmr/modules/blob/master/README.md#feature-comparison
[3]: The original syntax for this was `import(foo)` and
`import(foo.bar)`, respectively, but Hadley convinced me to drop
non-standard argument evaluation. I’m still not convinced that NSE is
actually harmful here, but I’m likewise not convinced that it’s
beneficial (although I personally like it in this case).
[4]: http://c2.com/cgi/wiki?StringlyTyped

Kind regards,
Konrad