[Bioc-devel] Short URLs for packages?

Martin Morgan mtmorgan at fredhutch.org
Tue Mar 24 12:14:59 CET 2015


On 03/24/2015 02:31 AM, Wolfgang Huber wrote:
> Before we start a religious war, can we make progress on the pragmatic goal of making it possible to provide such URLs to people?
>
> There are two concepts
> - ‘the package’ - a specific version, running in a specific environment, ‘frozen’, etc. (Gabe)
> - ‘the package’ - as a concept and a living artifact (me, Bernd, Tim)
> Both are useful. And having URLs for both would also be useful.

0. That's (mostly) satisfied by the current scheme:

   http://bioconductor.org/packages/3.0/bioc/html/BiocGenerics.html
   http://bioconductor.org/packages/release/bioc/html/BiocGenerics.html
   http://bioconductor.org/packages/devel/bioc/html/BiocGenerics.html

(hey, no www. -- that's four letters already!) Perhaps importantly, there's also
a hard-coded version for devel, 3.1, and for past releases. So as I understand
it, the request is for (a) shorter path names and (b) dynamic selection of
release vs. devel, discussed below, for the <6 month period when a package is
in devel but not yet in release. Henrik's earlier proposal, mentioned by Sean,
is also noted.


1. 'packages', 'bioc', 'html' all look somehow redundant, so

   http://bioconductor.org/release/BiocGenerics.html
   http://bioconductor.org/devel/BiocGenerics.html
   http://bioconductor.org/3.0/BiocGenerics.html

but also

   http://bioconductor.org/release/BiocGenerics/ (implicit index.html)
   http://bioconductor.org/BiocGenerics/release/

and their devel and version counterparts would seem quite possible / not 
profoundly controversial. Landing pages for specific package versions (e.g., 
3.22.7) do not currently exist, would change little across package minor 
versions, and would not lead to packages installable via biocLite(), so this 
idea of Tim's is a non-starter in my opinion.

Having the 'version' level of the path before the package provides a logical 
place to put biocViews for that release. I'd vote for one of the 
release/BiocGenerics[.html] schemes.
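
A minimal sketch of how the shorter forms in (1) could map back onto the current
paths, written in Python purely for illustration. The pattern SHORT and the
function expand() are hypothetical names, and this is not how bioconductor.org
is actually configured. Note that the short path alone does not say which
repository the package lives in, so the sketch simply assumes 'bioc' unless told
otherwise.

```python
import re

# Hypothetical short form: /<version>/<Package>, optionally ending in
# ".html" or "/". <version> is "release", "devel", or a number like "3.0".
SHORT = re.compile(
    r"^/(release|devel|\d+\.\d+)/([A-Za-z][A-Za-z0-9.]*?)(?:\.html|/)?$"
)

def expand(short_path, repo="bioc"):
    """Map a short path onto the current landing page path, or return None.

    repo is an assumption: the short path carries no repository information.
    """
    m = SHORT.match(short_path)
    if m is None:
        return None
    version, package = m.groups()
    return f"/packages/{version}/{repo}/html/{package}.html"

for path in ("/release/BiocGenerics.html",
             "/release/BiocGenerics/",
             "/3.0/BiocGenerics.html"):
    print(path, "->", expand(path))
```

Anything that does not start with a version-like component, such as the
existing /packages/... paths, falls through with None and would be served as
it is today.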


2. Something like

   http://bioconductor.org/BiocGenerics

redirecting to release when available, devel when newly added (Wolfgang's 
proposal) would in my opinion be confusing, especially since we continue to have 
so much difficulty with version mismatches in user installations. I don't think 
having a warning on redirect that 'this package is not available for release' 
would be effective either at advertising robust software or at enabling use by 
comparatively naive users.
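
To make the concern concrete, here is a small sketch of the dynamic resolution
described in (2), again in Python for illustration only. The function resolve()
and the package name "NewPkg" are hypothetical; the sets stand in for whatever
lists of release and devel packages the web site would consult.

```python
def resolve(package, release_pkgs, devel_pkgs):
    """Pick the branch a version-less URL would redirect to.

    Returns "release" if the package is in release, "devel" if it is only
    in devel, and None if it is unknown.
    """
    if package in release_pkgs:
        return "release"
    if package in devel_pkgs:
        return "devel"
    return None

# Before a release: NewPkg (hypothetical) exists only in devel.
release_pkgs = {"BiocGenerics", "limma"}
devel_pkgs = release_pkgs | {"NewPkg"}
print(resolve("limma", release_pkgs, devel_pkgs))
print(resolve("NewPkg", release_pkgs, devel_pkgs))

# After the next release the same URL points somewhere else.
release_pkgs = release_pkgs | {"NewPkg"}
print(resolve("NewPkg", release_pkgs, devel_pkgs))
```

The same link lands on devel one day and on release the next, and two links
of identical form can land on different branches at the same time, which is
the sort of mismatch that already causes trouble in user installations.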


3. In terms of the 'redundant' parts of the path, these are not completely 
arbitrary (not that these considerations have to dictate presentation; they do 
make one suspect that 'add a redirect and everything will be fine' will result 
in a nice plate of spaghetti, especially if there is some desire to retain 
backward compatibility).

'packages' separates the package repository from other aspects of 
bioconductor.org, and groups related concepts ('package', 'help', etc.) at a 
similar hierarchical level.

'bioc' serves to distinguish between software ('bioc/'), annotation 
('data/annotation') and experiment data ('data/experiment') packages, and these 
divide the overall repository into three for the purposes of biocLite() / 
install.packages() (this conceptual distinction has been useful, I think).

 > biocinstallRepos()
                                               BioCsoft
            "http://bioconductor.org/packages/3.1/bioc"
                                                BioCann
"http://bioconductor.org/packages/3.1/data/annotation"
                                                BioCexp
"http://bioconductor.org/packages/3.1/data/experiment"

'html' distinguishes the landing pages from the package tar balls / binary 
distributions themselves as returned by contrib.url(biocinstallRepos()), and 
from their vignette/, man/ and news/ resources.
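
The roles of the path components can be sketched as a small parser, in Python
for illustration only. The function parse_landing() is a hypothetical helper.
That annotation and experiment data landing pages follow the same
.../html/<Package>.html pattern under their repository paths is an assumption
made by analogy with the 'bioc' URLs above.

```python
from urllib.parse import urlparse

def parse_landing(url):
    """Split a current-scheme landing page URL into its components."""
    parts = urlparse(url).path.strip("/").split("/")
    # Expected shape: packages/<version>/<repo...>/html/<Package>.html
    if (len(parts) < 5 or parts[0] != "packages"
            or parts[-2] != "html" or not parts[-1].endswith(".html")):
        raise ValueError("not a landing page URL: " + url)
    return {
        "version": parts[1],                # "release", "devel", "3.0", ...
        "repo": "/".join(parts[2:-2]),      # "bioc", "data/annotation", ...
        "package": parts[-1][:-len(".html")],
    }

print(parse_landing(
    "http://bioconductor.org/packages/3.0/bioc/html/BiocGenerics.html"))
```

Each component carries information that a shortened URL would have to recover
some other way, the repository in particular.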


4. In terms of best practices, it seems that articles are about particular 
versions and should cite the package as such (for instance as .../3.1/... if 
the package is only in devel when the paper is being written), but that there 
is no substantive cost to also referencing 'current version available [after 
April, 2015] at .../release/...'.


5. At the end of the day I find myself casting my lot for landing pages with the 
form

   http://bioconductor.org/release/BiocGenerics/

which leads to a little less typing but not the dynamic resolution that started 
this (version of the) thread.


Martin

>
> Wolfgang
>
>> On Mar 23, 2015, at 18:43 GMT+1, Tim Triche, Jr. <tim.triche at gmail.com> wrote:
>>
>> I just meant that the mnemonic link
>>
>> http://www.bioconductor.org/limma/  (SEO version of limma ;-))
>>
>> could dump people at something like
>>
>> http://www.bioconductor.org/release/limma/3.22.7/   (I'd prefer this)
>>
>> or if need be for backwards compatibility,
>>
>> http://www.bioconductor.org/packages/3.0/limma/3.22.7/  (seems less good)
>>
>> instead of
>>
>> http://www.bioconductor.org/packages/3.0/bioc/html/limma.html  (current)
>>
>> and furthermore the specific version page could note more prominently that
>> the build of limma being referenced at that particular instance in time may
>> or may not be the same as was cited in a paper, used in an analysis,
>> available for download the previous evening, etc. thus citation("limma") is
>> a Very Good Idea when writing up results that depend upon it.  Because even
>> the WEHI guys could theoretically have a bug that impacted someone's
>> results (as opposed to the usual case of Didn't Read The Fine Limma Manual)
>>
>> Does that make more sense?  (Probably not, but worth a try)
>>
>> Statistics is the grammar of science.
>> Karl Pearson <http://en.wikipedia.org/wiki/The_Grammar_of_Science>
>>
>> On Mon, Mar 23, 2015 at 9:29 AM, Dan Tenenbaum <dtenenba at fredhutch.org>
>> wrote:
>>
>>>
>>>
>>> On March 23, 2015 9:18:57 AM PDT, "Tim Triche, Jr." <tim.triche at gmail.com>
>>> wrote:
>>>>
>>>>> Packages are (read: should be, IMHO) published, citable pieces of
>>>> research, though. Imagine if a paper you cite were silently updated
>>>> without the doi/citation changing. That wouldn't be good
>>>>
>>>> I don't disagree, but the existing setup does nothing to address that.
>>>> citation('limma'), for example, does.
>>>>
>>>> .../release/... and .../devel/... can change at any time, potentially
>>>> overnight (with or without a new BioC release).  The only real way to
>>>> cite an exact version is to cite that exact version, which is already
>>>> the proper way to do things and would remain unaffected by this, at
>>>> least AFAIK.
>>>>
>>>> Perhaps a useful addendum would be for the mnemonic
>>>>
>>>> http://bioconductor.org/limma
>>>>
>>>> To redirect to
>>>>
>>>>
>>> http://bioconductor.org/packages/limma/whateverTheMostRecentStableVersionMayBe/
>>>>
>>>> And then everything is explicit.
>>>>
>>>> Does that address the competing issues discussed herein?
>>>
>>>
>>> Note that 'release' and 'devel' are just symlinks to the current release
>>> and devel versions. I.e. currently 3.0 and 3.1 respectively. So you can
>>> always link directly to a specific version.
>>>
>>> Dan
>>>
>>>>
>>>> Best,
>>>>
>>>> --t
>>>> _______________________________________________
>>>> Bioc-devel at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>
>>>
>>
>
>


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


