[R-pkg-devel] Questions about making a database package (Rpolyhedra)

alejandro baranek alejandrobaranek using gmail.com
Thu Jun 28 13:56:15 CEST 2018


Hi Joris:

Thank you for your comments.
Of course, we are using https for additional downloads.
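
As a rough illustration of the download-on-first-use step discussed in this
thread, here is a minimal sketch and not the package's actual code. The
function names, the owner "example-owner", the ".Rpolyhedra" folder and the
tag value are all hypothetical; only the repo name RpolyhedraDB, the use of
tags and the https requirement come from the thread.

```r
# Minimal sketch (assumed names throughout): fetch a tagged database archive
# over https into a folder under the user's home directory, once per tag.

# Build the https URL of the GitHub source archive for a given tag.
# "example-owner" is a placeholder; the thread does not give the real owner.
db_archive_url <- function(tag, repo = "example-owner/RpolyhedraDB") {
  sprintf("https://github.com/%s/archive/refs/tags/%s.zip", repo, tag)
}

# Local folder where the database for a given tag would live (hypothetical layout).
db_local_dir <- function(tag, home = path.expand("~")) {
  file.path(home, ".Rpolyhedra", tag)
}

# Download and unpack the database only if this tag is not installed yet.
# `fetch` is injectable so the function can be exercised without network access.
ensure_db <- function(tag, home = path.expand("~"),
                      fetch = utils::download.file) {
  target <- db_local_dir(tag, home)
  if (dir.exists(target)) {
    return(invisible(target))            # already installed, nothing to do
  }
  url <- db_archive_url(tag)
  stopifnot(startsWith(url, "https://")) # secure download mechanism only
  zip_file <- tempfile(fileext = ".zip")
  on.exit(unlink(zip_file), add = TRUE)
  fetch(url, destfile = zip_file, mode = "wb")
  dir.create(target, recursive = TRUE)
  utils::unzip(zip_file, exdir = target)
  invisible(target)
}
```

With something like this, a call such as ensure_db("v0.2.5") on first use
would download the archive once, and later calls would find the folder and
return immediately. Choosing a different tag is how a different DB size
could be selected.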

For the moment we do not need github LFS, but it is an alternative we can
explore after this short-term step: our immediate goal is to make the
package lighter on CRAN. It is now 35 KB, so I think we managed that well.

We are defining an XSD for exporting polyhedra as XML. After that, it will
be possible to build an API on top of the polyhedra database and make the
improvement you suggest. That will take time, though: we have no funding yet
for this project and want to implement some functionality that makes it more
valuable first. Still, making it easy to port to other languages is on our
roadmap. The interface we are using is really simple, so it will probably
become the API interface too.

Best, Ale.


2018-06-28 5:23 GMT-03:00 Joris Meys <Joris.Meys using ugent.be>:

> Hi Ale,
>
> I'd personally use a more specific solution like github LFS (large file
> storage) for a versioned database. You should also check with CRAN itself,
> as they keep high standards for everything that's not a standard install.
> More specifically (from CRAN policies) :
>
> Downloads of additional software or data as part of package installation
> or startup should only use secure download mechanisms (e.g., ‘https’ or
> ‘ftps’).
>
> Personally I would store that information in a public database somewhere
> with a (minimal) API. This can then be extended without inflating the
> download and would allow people to install only a subset of what they need.
> That would also allow people to port your work to other languages by
> simply writing a wrapper around the DB API. It's not a necessity, but I
> thought it was worth mentioning as an option.
>
> Cheers
> Joris
>
> On Wed, Jun 27, 2018 at 10:22 PM, alejandro baranek <
> alejandrobaranek using gmail.com> wrote:
>
>> Right now, we are in this situation: about 150 polyhedra published,
>> but more than 800 ready to publish; because of the package size we cannot
>> publish all of them.
>>
>> It is not a problem on github, it's a problem on CRAN, with build time
>> (we fixed the testing time with simple sampling techniques). I would like
>> to hear more from experienced package developers about these issues, but we
>> seem to have found a solution.
>>
>> We decided to create another github repo, RpolyhedraDB. When you install
>> the package, it downloads the database, at the tag recorded in the data
>> folder of the package, into a directory under the user's home directory.
>> So the package will be minimal for CRAN, will be RR and will install the
>> database on first use (on TRAVIS or other QA/continuous integration it will
>> of course install it too). It will be possible to set up different DB sizes
>> using the tags, in case we find that preferable for users.
>>
>>
>> Best, Ale.
>>
>>
>> 2018-03-29 4:43 GMT-03:00 Berry Boessenkool <berryboessenkool using hotmail.com>:
>>
>> >
>> > I assume you cannot simply reduce the 150 to a few for demonstration
>> > purposes?
>> >
>> >
>> > I have seen people using DRAT packages on github for data, but gh has
>> > size restrictions as well...
>> >
>> >
>> > No expert in this, but maybe this helps a little bit...
>> >
>> > Berry
>> >
>> >
>> > ------------------------------
>> > *From:* R-package-devel <r-package-devel-bounces using r-project.org> on behalf
>> > of alejandro baranek <alejandrobaranek using gmail.com>
>> > *Sent:* Tuesday, March 27, 2018 19:26
>> > *To:* r-package-devel using r-project.org
>> > *Subject:* [R-pkg-devel] Questions about making a database package
>> > (Rpolyhedra)
>> >
>> > Hello group:
>> >
>> > We released Rpolyhedra V0.2 last month. It is able to scrape more than 800
>> > polyhedra definitions from public sources. At V0.2.4 we are publishing only
>> > 150 because of the time needed for scraping all the polyhedra, the testing
>> > and the resulting size of the package. The difference is a configuration in
>> > zzz.R, very simple to change (whoever wants to try it can build the package
>> > themselves).
>> > The source files of the polyhedra definitions alone are more than 12 MB in
>> > size (we are including them in the data folder for package
>> > self-sufficiency).
>> >
>> > But we have doubts about good practices for publishing a database package.
>> >
>> > We think the solution is to split the package into an internal
>> > Rpolyhedra-lib, open source but not on CRAN, and Rpolyhedra with a catalog
>> > which makes it possible to connect to that repo and download scraped
>> > polyhedra on demand.
>> >
>> > We have to think further about how to connect both repositories, but
>> > before touching any code we want to hear from experienced package
>> > developers and the community in general about how to do this.
>> > Do you know of any package with behavior analogous to this one? We didn't
>> > find one.
>> >
>> > Best, Ale.
>> > --
>> >  alejandro baranek
>> > @ken4rab <https://twitter.com/ken4rab>
>> > qbotics <http://qbotics.tumblr.com/> | surferinvaders
>> > <http://surferinvaders.tumblr.com> | algebraic-soundscapes
>> > <http://imaginary.org/content/algebraic-soundscapes> | surfer-shuffle
>> > <http://imaginary.org/program/surfer-shuffle>
>> >
>> >         [[alternative HTML version deleted]]
>> >
>> > ______________________________________________
>> > R-package-devel using r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-package-devel
>> >
>>
>
>
>
> --
> Joris Meys
> Statistical consultant
>
> Department of Data Analysis and Mathematical Modelling
> Ghent University
> Coupure Links 653, B-9000 Gent (Belgium)
>
> tel: +32 (0)9 264 61 79
> -----------
> Biowiskundedagen 2017-2018
> http://www.biowiskundedagen.ugent.be/
>
> -------------------------------
> Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
>





