[Rd] organisation of packages & CRAN

Ben Bolker bbolker at gmail.com
Sun Nov 9 21:26:49 CET 2014


Gábor Csárdi <csardi.gabor <at> gmail.com> writes:

> 
> Hi,
> 
> I think much of this is simply impossible to do. CRAN packages are
> written and maintained by thousands of people; how are you planning to
> convince them to reorganize their packages? Or even just rename them?
> This obviously won't happen.
> 
> Btw. did you see 'CRAN Task Views'? That is one organization of
> packages into topics.
> 
> Personally, I don't think organization is the solution here. It is too
> costly (i.e. too much work) to maintain and impossible to enforce. I
> think, however, that a good search engine would definitely help.
> 
> FWIW there is a simple search engine here: http://metacran.github.io/search/
> This ranks packages according to the number of reverse dependencies
> (among other things), i.e. packages more often used by other packages
> will be higher up in the list.
> 
> Ranking them according to downloads is also possible, but AFAIK only
> one CRAN mirror gives out statistics about downloads, so you don't
> really have the complete numbers there.
> 
> Disclaimer: I built the search engine above. There are obviously other
> alternatives as well; http://rdocumentation.org and
> http://mran.revolutionanalytics.com/packages/ are the two I know.
> 
> Gabor
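
To make the reverse-dependency ranking described above concrete, here is a
minimal sketch in Python. Everything in it is an assumption for illustration:
the package names and dependency edges are made up, the function name
rank_by_reverse_deps is hypothetical, and it counts direct reverse
dependencies only. The search engine mentioned above ranks on this "among
other things", so this is not a description of how it actually works.

```python
from collections import Counter

# Hypothetical toy data: package -> packages it depends on.
# These names and edges are invented, not real CRAN metadata.
DEPENDS = {
    "pkgA": ["core", "utilsx"],
    "pkgB": ["core"],
    "pkgC": ["core", "pkgA"],
    "utilsx": [],
    "core": [],
}


def rank_by_reverse_deps(depends):
    """Return (package, reverse-dependency count) pairs, most depended-on first."""
    # Start every known package at zero so unused packages still appear.
    counts = Counter({pkg: 0 for pkg in depends})
    for pkg, deps in depends.items():
        for dep in set(deps):  # count each dependent package at most once
            counts[dep] += 1
    # Sort by count (descending), then by name so ties have a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_by_reverse_deps(DEPENDS)
for name, n_revdeps in ranking:
    print(name, n_revdeps)
```

With real data, the dependency table would have to come from CRAN package
metadata rather than a hand-written dictionary, and a count like this says
nothing about packages that are used mostly by end users rather than by
other packages.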

  A few more thoughts:

* similar topics have been discussed _many_ times over the years on
the R mailing lists (sorry, I can't point you to any specific
threads). So far the R core/CRAN team have not indicated any interest
in making changes in the directions you suggest, so it's up to
the community to implement the things it would like to see.  There's
nothing stopping you from mirroring CRAN packages in any way you'd
like (e.g. see Revolution R's 'MRAN': http://mran.revolutionanalytics.com/ ,
which among other things allows you to sort packages by task view).

In addition to the Task Views pointed out by Gabor (you may enjoy
this version: http://www.maths.lancs.ac.uk/~rowlings/R/TaskViews/ ),
there have been a variety of individual/community attempts to provide
more package information:

* CRANberries http://dirk.eddelbuettel.com/cranberries/ gives a feed
about package changes
* CRANtastic http://crantastic.org/ attempted to set up a community
site for package rating/voting (never got a lot of traction though).
* download information _is_ available, unofficially, from some 
mirrors other than the RStudio mirror: see
http://www.rpubs.com/bbolker/3750

Questions:

* how would you propose to enforce package naming? (One of the
great things about packaging code in R is the relatively *low*
barriers to entry ... but that has obvious disadvantages ...)
* who's going to enforce and curate the metadata?
* who's going to decide on the criteria for CRAN package removal
(i.e. how to determine quality, or how to decide on a threshold
for removal?) There's some filtering based on packages failing
their automated checks and being archived as R advances ...
 
> On Sun, Nov 9, 2014 at 11:24 AM, Steven Sagaert
> <steven.sagaert <at> gmail.com> wrote:
> > Hi,

> > I’ve been using R on and off for a couple of years. I think R is
> > pretty great, but one thing I’d like to see improved is the way
> > packages are organised. Instead of CRAN being a long list of
> > packages with short and usually unintelligible names, I’d like to
> > see packages organised hierarchically, with the path acting
> > as a hierarchical namespace, as in many other
> > languages such as Java, C#, Scala, … The names of the (sub)packages
> > should also be clear and unambiguous, and packages should be organised
> > according to their functionality rather than, for example, being the code
> > for a whole book thrown together under a cryptic name.

> > Next to that, it would be nice to have extra metadata in the
> > packages to allow for another, looser, flat multi-class
> > classification, as in blog tagging systems, and other metadata to allow
> > for automatically generating something like task views.

> > Due to the large number of packages it’s hard to see the forest
> > for the trees, so a recommendation system for CRAN based on
> > popularity (download statistics), ratings and other data such as related
> > packages from package metadata would be most welcome.


> > Finally, the number of packages on CRAN is growing exponentially, but
> > there is also a large partial overlap in functionality between
> > packages, and so many packages make it hard to find what you are
> > looking for. So maybe less is more here, and there should be a
> > system for removing hardly used or low-quality packages on a regular
> > basis.


