[R-pkg-devel] Questions about making a database package (Rpolyhedra)

Joris Meys (ugent.be)
Thu Jun 28 10:23:35 CEST 2018


Hi Ale,

I'd personally use a more specific solution like Git LFS (Large File
Storage) on GitHub for a versioned database. You should also check with CRAN
itself, as they hold anything beyond a standard install to a high standard.
More specifically, from the CRAN policies:

Downloads of additional software or data as part of package installation or
startup should only use secure download mechanisms (e.g., ‘https’ or
‘ftps’).
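
As a rough sketch of what that policy implies for a first-use download: the
function names, the owner/repo/tag values and the archive URL pattern below
are assumptions for illustration, not Rpolyhedra's actual code.

```r
# Hypothetical helper: build an HTTPS URL for a tagged archive of a data repo.
# The URL pattern is an assumption about how the data repo would be published.
db_url <- function(owner, repo, tag) {
  sprintf("https://github.com/%s/%s/archive/%s.zip", owner, repo, tag)
}

# Accept only the secure schemes named in the policy text (https or ftps).
is_secure_url <- function(url) {
  grepl("^(https|ftps)://", url)
}

# Download the archive on first use only; later calls reuse the cached file.
fetch_db <- function(url, dest_dir) {
  stopifnot(is_secure_url(url))
  dir.create(dest_dir, recursive = TRUE, showWarnings = FALSE)
  dest <- file.path(dest_dir, basename(url))
  if (!file.exists(dest)) {
    utils::download.file(url, dest, mode = "wb")
  }
  dest
}

# Placeholder coordinates; the download itself is left commented out.
url <- db_url("some-owner", "some-data-repo", "v0.1.0")
# path <- fetch_db(url, file.path(tempdir(), "polyhedra-db"))
```

The example writes under tempdir() because, as far as I recall, the same
policies also restrict writing into the user's home filespace without
consent, so the target directory is worth checking with CRAN too.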

Personally, I would store that information in a public database somewhere
with a (minimal) API. The database can then grow without inflating the
download, and people can install only the subset they need. It would also
let people port your work to other languages by simply writing a wrapper
around the DB API. It's not a necessity, but I thought it was worth
mentioning as an option.
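
To make the API idea a bit more concrete, here is a minimal sketch of the
client side. The endpoint, the "names" query parameter and the function name
are all hypothetical; no such service exists as far as this thread goes.

```r
# Hypothetical endpoint, used only to illustrate the shape of a request.
api_base <- "https://example.org/polyhedra/v1"

# Build the request URL for a subset of polyhedra, identified by name.
# Names are percent-encoded so that spaces and symbols survive in the query.
polyhedra_request <- function(names, base = api_base) {
  encoded <- vapply(names, utils::URLencode, character(1),
                    reserved = TRUE, USE.NAMES = FALSE)
  paste0(base, "/polyhedra?names=", paste(encoded, collapse = ","))
}

# A user asks for two polyhedra instead of downloading the whole database.
req <- polyhedra_request(c("cube", "truncated icosahedron"))
```

A wrapper in another language would only need to build the same URL and
parse the response, which is what makes the port cheap.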

Cheers
Joris

On Wed, Jun 27, 2018 at 10:22 PM, alejandro baranek <
alejandrobaranek using gmail.com> wrote:

> This is where we stand now: about 150 polyhedra are published, but more
> than 800 are ready to publish, and because of the package size we cannot
> publish all of them.
>
> This is not a problem on GitHub; it is a problem on CRAN, because of build
> time (we fixed the testing time with simple sampling techniques). I would
> like to hear more from experienced package developers about these issues,
> but we seem to have found a solution.
>
> We decided to create another GitHub repo, RpolyhedraDB. When you install
> the package, it downloads the database, at the tag recorded in the package's
> data folder, into a directory under the user's home directory. So the
> package will be minimal for CRAN, will be RR, and will install the database
> on first use (on Travis or other QA/continuous integration it will of
> course install it). It will be possible to offer different DB sizes using
> the tags, in case we find that preferable for the users.
>
>
> Best, Ale.
>
>
> 2018-03-29 4:43 GMT-03:00 Berry Boessenkool <berryboessenkool using hotmail.com>
> :
>
> >
> > I assume you cannot simply reduce the 150 to a few for demonstration
> > purposes?
> >
> >
> > I have seen people using drat repositories on GitHub for data, but GitHub
> > has size limits as well...
> >
> >
> > No expert in this, but maybe this helps a little bit...
> >
> > Berry
> >
> >
> > ------------------------------
> > *From:* R-package-devel <r-package-devel-bounces using r-project.org> on
> > behalf of alejandro baranek <alejandrobaranek using gmail.com>
> > *Sent:* Tuesday, March 27, 2018 19:26
> > *To:* r-package-devel using r-project.org
> > *Subject:* [R-pkg-devel] Questions about making a database package
> > (Rpolyhedra)
> >
> > Hello group:
> >
> > We released Rpolyhedra V0.2 last month. It can scrape more than 800
> > polyhedra definitions from public sources. As of V0.2.4 we publish only
> > 150, because of the time needed to scrape and test all the polyhedra and
> > the resulting size of the package. The difference is a setting in zzz.R
> > that is very simple to change (anyone who wants to try it can build the
> > package themselves).
> > The source files of the polyhedra definitions alone take more than 12 MB
> > (we include them in the data folder so the package is self-sufficient).
> >
> > But we have doubts about good practices for publishing a database
> > package.
> >
> > We think the solution is to split the package into an internal
> > Rpolyhedra-lib, open source but not on CRAN, and Rpolyhedra with a catalog
> > which connects to that repo to download scraped polyhedra on demand.
> >
> > We still have to think through how to connect both repositories, but
> > before touching any code we want to hear from experienced package
> > developers and the community in general about how to do this.
> > Do you know of any package that behaves in a similar way? We didn't
> > find one.
> >
> > Best, Ale.
> > --
> >  alejandro baranek
> > @ken4rab <https://twitter.com/ken4rab>
> > qbotics <http://qbotics.tumblr.com/> | surferinvaders
> > <http://surferinvaders.tumblr.com> | algebraic-soundscapes
> > <http://imaginary.org/content/algebraic-soundscapes> | surfer-shuffle
> > <http://imaginary.org/program/surfer-shuffle>
> >
> >         [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-package-devel using r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-package-devel
> >
>



-- 
Joris Meys
Statistical consultant

Department of Data Analysis and Mathematical Modelling
Ghent University
Coupure Links 653, B-9000 Gent (Belgium)

tel: +32 (0)9 264 61 79
-----------
Biowiskundedagen 2017-2018
http://www.biowiskundedagen.ugent.be/



