[R-pkg-devel] Best practice for accessing fat JAR Java application in R package

Leifeld, Philip
Sat May 1 11:18:23 CEST 2021


Dear all,

I wrote an open-source Java application, which I release as a
stand-alone JAR file. I also wrote an R package, which adds
functionality and some wrapper functions so that the JAR can be used
from R without the GUI. It uses rJava for this purpose. So far, I have
released both the JAR and the R package on GitHub and included a
function in the R package that lets the user download the latest JAR
and store it in the package installation directory under java/.
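
A minimal sketch of such a download helper is shown here for
illustration. The function names jar_destination() and download_jar()
and their arguments are hypothetical, not taken from the actual
package; only utils::download.file() is a real base R call.

```r
# Hypothetical helpers; names and arguments are assumptions for illustration.

# Build the target path <pkg_dir>/java/<jar_name>.
jar_destination <- function(pkg_dir, jar_name) {
  file.path(pkg_dir, "java", jar_name)
}

# Download a JAR from `url` into the java/ subdirectory of `pkg_dir`.
download_jar <- function(url, pkg_dir, jar_name = basename(url)) {
  dest <- jar_destination(pkg_dir, jar_name)
  dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)
  # mode = "wb" keeps the binary file intact on Windows.
  status <- utils::download.file(url, destfile = dest, mode = "wb",
                                 quiet = TRUE)
  if (status != 0) stop("Download failed: ", url)
  invisible(dest)
}
```

In a package, pkg_dir would typically be the result of
system.file(package = "<package name>"), which requires write access to
the library directory.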

But this is unreliable for several reasons. GitHub limits the number of
downloads per unit of time, which occasionally disrupts testing and the
end-user experience; users sometimes end up with incompatible versions
of the JAR and the R package; and the download is an inconvenient extra
step for the user. I was thinking it may be more reliable to bundle the
two pieces together by including the JAR file in the inst/java
directory of the package sources. This would also allow me to
initialize Java with the JAR file upon loading the package without
asking the user to download or initialize anything.
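
For illustration, initializing Java at load time with a bundled JAR
could be sketched as follows. This assumes the JAR has been installed
into the java/ directory of the package; rJava's .jpackage() then starts
the JVM if needed and adds the JAR files found there to the class path.

```r
# R/zzz.R (sketch; assumes rJava is listed under Imports in DESCRIPTION)
.onLoad <- function(libname, pkgname) {
  # Start the JVM if necessary and add all JARs in <pkg>/java/
  # to the class path.
  rJava::.jpackage(pkgname, lib.loc = libname)
}
```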

However, my JAR file depends on other open-source software in the form
of JAR files packaged into it (so it is a "fat" JAR file). For example,
my Java program accesses SQLite databases, and for this I need the
SQLite JDBC JAR, which is about 7 MB in size. This leads to a NOTE
during R CMD check, which, I guess, will prevent CRAN submission:

N  checking installed package size ...
      installed size is  9.1Mb
      sub-directories of 1Mb or more:
        java   8.1Mb

As an alternative solution, I wrote a function that downloads the
correct version of the JAR file and stores it in inst/java. This
function is executed in R/zzz.R during installation. I was hoping the
freshly downloaded JAR file would then be copied from inst/java to
java/ in the installation directory of the package in the library path,
but this does not seem to be the case. Even if it were copied to the
desired location, the installed package size would still be too large
and would yield the same NOTE.
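
Whether a JAR actually ended up in the installed java/ directory can be
checked with a small helper like the one below. This is only a sketch;
find_installed_jar() is a hypothetical name, and the package name
passed to system.file() would have to be filled in.

```r
# Hypothetical helper: list the JAR files in a directory, or fail
# with an informative error if none are present.
find_installed_jar <- function(java_dir, pattern = "\\.jar$") {
  jars <- list.files(java_dir, pattern = pattern, full.names = TRUE)
  if (length(jars) == 0L) {
    stop("No JAR file found in '", java_dir, "'")
  }
  jars
}
```

Within a package, java_dir would be obtained with
system.file("java", package = "<package name>"), which returns an empty
string if the directory does not exist in the installed package.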

I would be grateful if somebody could suggest a best practice to deal
with this problem. Should I include the JAR in the package or store it
online? If I include it, how do I deal with the CRAN size limit for R
packages? If I store it online, how do I deal with the unreliability of
GitHub and the other issues mentioned above?

Many thanks in advance,

Philip

-- 
Philip Leifeld
Professor, Department of Government
University of Essex
http://www.philipleifeld.com

