[Rd] Runnable R packages
Duncan Murdoch
murdoch.duncan at gmail.com
Thu Jan 31 16:26:00 CET 2019
On 31/01/2019 9:32 a.m., David Lindelof wrote:
> Belated thanks to all who replied to my initial query. In summary, three
> approaches have been mentioned to run R code "in production": 1)
> ShinyProxy, mentioned by Tobias, for deploying Shiny applications; 2)
> Docker-like solutions, mentioned by Gergely and Iñaki; and 3) Solutions
> based on Rscript or littler, mentioned by Dirk.
>
> I can't speak to 1) because I don't currently use Shiny. And it seems to me
> that Docker-like solutions will still need some "point of entry" for the R
> application, which will have to be Rscript or littler.
>
> In my first email, I observed that Rscript expects a single expression or a
> single script, which is probably why (in my experience) many data
> scientists tend to provide their code in a very limited number of files.
> Gergely disagreed, arguing to the contrary that data scientists are
> encouraged to provide their application as an R package called by a short
> script executed by Rscript. But this doesn't happen where I work for
> several reasons:
>
> - it implies installing your package on the production machine(s),
> including its dependencies, which must be done by hand
> - some machine learning platforms will simply not accept code provided
> as an R package
> - we have some "big data" use cases for which we need Spark; Spark can
> run R or Python code, but only when it is provided as a single file. (On
> the other hand, Spark can run applications provided as JAR files)
>
> In summary, I'm convinced R would benefit from something similar to Java's
> `Main-Class` header or Python's `__main__` module. A new R CMD command
> would take a package, install its dependencies, and run its "main"
> function. If we had this machinery available, we could even consider
> reaching out to the developers of Spark (and other tech stacks) to make it
> easier to develop R applications for those platforms.
>
> A candid comment from Dirk suggested that I should implement this myself,
> which I would be happy to do, provided this is the normal procedure. Or is
> there a more formal process I should follow?

You can't implement it to run under R CMD, but it should be
straightforward to put this in an R package, to be run by Rscript using
something like

    Rscript -e "yourpackage::run_main('somepackage')"

You can use the installation code from the `remotes` package, so
run_main() could be a pretty simple function.
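
To make that concrete, here is a rough sketch of what run_main() might look
like. It is only an illustration under some assumptions: the name run_main()
comes from the command line above, but the `args` and `entry` parameters and
the convention that the target package exports a function called main() are
hypothetical, and the sketch assumes the package can be installed from CRAN
with remotes::install_cran().

```r
# Hypothetical sketch: the `args`/`entry` parameters and the "exported main()"
# convention are assumptions, not an existing R facility.
run_main <- function(pkg,
                     args = commandArgs(trailingOnly = TRUE),
                     entry = "main") {
  # Install the package (and its dependencies) only if it is not available.
  # Assumption: the package lives on CRAN or another configured repository.
  if (!requireNamespace(pkg, quietly = TRUE)) {
    remotes::install_cran(pkg)
  }

  # Look up the exported entry point; this errors if it is not exported.
  main <- getExportedValue(pkg, entry)
  if (!is.function(main)) {
    stop("'", entry, "' in package '", pkg, "' is not a function")
  }

  # Hand the command-line arguments to the entry point as a character vector.
  main(args)
}
```

A package that is not on CRAN would need a different installer from
`remotes` (for example install_local() or install_github()) in place of
install_cran().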
Duncan Murdoch