[Bioc-devel] Package reference manuals in html

Andrzej Oleś andrzej.oles at gmail.com
Thu Mar 17 00:33:46 CET 2016


Hi all,

I had a discussion earlier today with Martin and Dan on providing online
man pages for Bioconductor packages. As we dived into implementation
details, it turned out that this idea is a little bit more complex and
resource-intensive than originally anticipated.

The main problem in generating man pages in a repository-wide fashion seems
to be the cross-linking of packages. Briefly, in order to generate the
links, apparently one needs to generate the html pages in an R installation
which is aware of the other packages. For example, the Rd macro
\linkS4class{ClassName} takes as argument only the class name, and the
corresponding package containing the class definition is "automagically"
resolved by R. I'm not sure how this could be done manually, on a
per-package basis. So by the end of the day, in order to generate static
man pages, we would need to maintain a complete BioC repo installation,
possibly on a system with the --enable-prebuilt-html configure option.
Unfortunately, it seems infeasible to use the build servers for this, as
it would significantly increase the computational burden: currently only
around 2/5 of all software and data packages are actually installed by the
build system, and the rest, which have no reverse dependencies, are
skipped. Installing the remaining 3/5 of packages on a regular basis, not
to mention the heavy annotation packages, would be overkill. So
piggy-backing on the existing infrastructure doesn't seem realistic.
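
To illustrate the cross-linking point, here is a minimal sketch of how the
link targets depend on the local R library. It uses a made-up Rd page with a
plain \link{} to a base R topic rather than \linkS4class{}, and it assumes
the static route via tools::findHTMLlinks() and tools::Rd2HTML(); I have not
checked whether this is the best entry point for a repository-wide build.

```r
library(tools)

# A tiny, hypothetical Rd page written to a temporary file.
rd_file <- tempfile(fileext = ".Rd")
writeLines(c(
  "\\name{demo}",
  "\\alias{demo}",
  "\\title{Demo page}",
  "\\description{See \\code{\\link{mean}} for details.}"
), rd_file)

# The topic -> html file table is built from the packages installed in
# this R library; topics of packages that are not installed are absent.
links <- findHTMLlinks()

# Render the page to static html, resolving \link{} through that table.
html_file <- tempfile(fileext = ".html")
Rd2HTML(parse_Rd(rd_file), out = html_file, package = "demo", Links = links)

html <- readLines(html_file)
```

The "demo" package name is a placeholder. The point is only that `links` is
computed from whatever is installed locally, which is why a complete
installation seems to be needed to resolve links across the whole repo.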

On top of this, even if we had access to a machine with a complete,
up-to-date BioC installation (maybe by just updating the packages after the
repo gets rebuilt rather than re-installing them from scratch each time),
it remains an open question how "external" links to, let's say, CRAN
packages, or even base R packages, should be handled.

A lightweight and easy-to-implement alternative for those willing to share
self-hosted documentation of their packages could be to provide in the
package DESCRIPTION file a "Documentation" field containing a link to an
external resource, which would then appear on the package landing page next
to the vignettes and pdf manual. The obvious downsides of this solution
are: 1. no package cross-links, and 2. the burden of keeping the
documentation in sync with the package version on BioC would be in
the maintainer's hands...
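
As a rough sketch of how such a field could be picked up when the landing
page is generated: the "Documentation" field name, the package name and the
URL below are all hypothetical, and I am only assuming that the field would
be read like any other DESCRIPTION entry with read.dcf().

```r
# Hypothetical DESCRIPTION file; "Documentation" is the proposed field,
# not one that R or Bioconductor define today.
desc_file <- tempfile()
writeLines(c(
  "Package: MyPkg",
  "Version: 1.0.0",
  "Documentation: https://example.org/MyPkg/man/"
), desc_file)

# read.dcf() returns a character matrix; fields that are absent come back
# as NA, so packages without the field can simply be skipped.
desc <- read.dcf(desc_file, fields = c("Package", "Version", "Documentation"))
doc_url <- unname(desc[1, "Documentation"])

if (!is.na(doc_url)) {
  message("Documentation link for ", desc[1, "Package"], ": ", doc_url)
}
```

Nothing here checks that the linked pages match the package version on
BioC, which is exactly the second downside mentioned above.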

I will try to contact the authors of rdocumentation.org - maybe they have
some useful comments or even code which they would be willing to share. In
any case, it would be good to know what their experience is and why they
stopped maintaining their service. Maybe the BioC community could jump in
and help them to resolve the bottlenecks and keep the website up to date.

Cheers,
Andrzej


On Tue, Mar 8, 2016 at 4:36 PM, Andrzej Oleś <andrzej.oles at gmail.com> wrote:

> Hi Martin,
>
> thank you for your suggestions - I would be happy to contribute to this! I
> could help with developing the scripts for generating man pages, and
> integrating them with the website layout.
>
> As for rendering the man pages, I suggest that we try a similar approach
> to the one used by knitr::knit_rd() rather than plain tools::Rd2HTML(). It
> has the advantage that the examples are actually run, and the results, e.g.
> plots, are included in the output documents. I hope you can appreciate the
> added value by comparing the following man page rendered using
> tools::Rd2HTML() and knitr::knit_rd(), respectively.
> http://www.huber.embl.de/users/aoles/man/Image.html
> http://www.huber.embl.de/users/aoles/man/Image-knitr.html
> Regarding the additional dependencies: we kind of already rely on knitr
> when compiling vignettes, so this shouldn't add much to the
> maintenance burden.
>
> Cheers,
> Andrzej
>
> On Fri, Mar 4, 2016 at 2:20 PM, Morgan, Martin <
> Martin.Morgan at roswellpark.org> wrote:
>
>> One thing about accessing the html versions locally (e.g., via ? with
>> options(help_type="html") or help.start() or Rstudio) is that you get the
>> version relevant to your R / Bioconductor, rather than whatever is at the
>> top of google; I guess the same applies to the pdf versions, and the reason
>> that there isn't more current confusion is because the online pdf versions
>> are not as useful as the off-line help system.
>>
>> I think Laurent was interested in an integration of help pages across
>> packages (which is the appeal of rdocumentation.org?), not just
>> rendering the help pages in html rather than pdf? An integration of help
>> pages would definitely be a big job with substantial development and
>> maintenance; we will not be undertaking this ourselves.
>>
>> For the more limited case of adding a (directory of) html files for the
>> manual, it's not impossible that we could find the resources to do this
>> in the next 6 months.
>>
>> One intermediate and helpful step for those willing to help would be to
>> develop the code to process help pages into a style consistent with the
>> bioconductor web site. One place where this could be implemented would be
>> the BiocStyle package (https://github.com/Bioconductor-mirror/BiocStyle
>> but hmm, seems like there's a slightly out of sync version at
>> https://github.com/Bioconductor/BiocStyle that would be more
>> convenient...). Perhaps this really means only developing a css style sheet
>> and R's tools::Rd2HTML() (I'm very reluctant to introduce dependencies into
>> the build system, and am very conservative about inclusion of fancy
>> features in the html -- these become significant maintenance burdens moving
>> forward).
>>
>> The web site is generated by
>> https://github.com/Bioconductor/bioconductor.org, with the style sheet
>> at
>> https://github.com/Bioconductor/bioconductor.org/blob/master/assets/style/bioconductor.css.
>> The package landing pages are templated using
>> layouts/_bioc_views_package_detail.html. The idea would be to end up with
>> layouts/_bioc_man_index.html and _bioc_man_body.html that wrapped output
>> from BiocStyle in the overall bioc page.
>>
>> The implementation suggestions above are just a sketch and could be quite
>> misguided. If there's interest then probably we should set up a hangout to
>> discuss in a little more detail.
>>
>> Martin
>>
>> ________________________________________
>> From: Bioc-devel <bioc-devel-bounces at r-project.org> on behalf of
>> Hartley, Stephen (NIH/NHGRI) [F] <stephen.hartley at nih.gov>
>> Sent: Wednesday, March 2, 2016 11:46 AM
>> To: Laurent Gatto; bioc-devel
>> Subject: Re: [Bioc-devel] Package reference manuals in html
>>
>> I'd like to second this. Currently Bioconductor hosts the pdf reference
>> manuals, but those are often sub-ideal. The page breaks make it harder to
>> read, the fixed width basically makes it either too small or too big
>> depending on your display, you can't navigate cross-package links, and in
>> general using paper-formatted software documentation is just poor form.
>>
>> Yihui, the creator of knitr, has a blog post where he shows how to do
>> this. There are a lot of ways to do this, and it's generally pretty
>> straightforward.
>> http://yihui.name/en/2012/10/build-static-html-help/
>>
>> You can also use a function in knitr, knit_rd(), which builds the
>> examples as well and inserts the output right onto the page. That's what I
>> used to make the docs for QoRTs (
>> http://hartleys.github.io/QoRTs/Rhtml/index.html) and JunctionSeq (
>> http://hartleys.github.io/JunctionSeq/Rhtml/index.html).
>>
>> Or you can use the staticdocs package, which does the same basic thing
>> but prettier (see ggplot2's docs: http://docs.ggplot2.org/current/)
>>
>> The nuclear option, of course, is to do what CRAN does and rebuild R on
>> (one of) the servers using the --enable-prebuilt-html configure option.
>> That might affect other things, though, and might not be ideal.
>>
>> Does any of this seem like a viable option for Bioconductor? I think it
>> could be an incredibly valuable resource for the community. Are there any
>> technical issues that haven't been considered in the above?
>>
>> Regards,
>> Steve Hartley
>>
>> -----Original Message-----
>> From: Laurent Gatto [mailto:lg390 at cam.ac.uk]
>> Sent: Tuesday, March 01, 2016 6:42 AM
>> To: bioc-devel
>> Subject: [Bioc-devel] Package reference manuals in html
>>
>>
>> Dear all,
>>
>> I find the http://www.rdocumentation.org/ site very useful to refer to
>> nicely formatted online man pages individually. Unfortunately, this
>> resource is terribly outdated and not maintained anymore.
>>
>> I was wondering if Bioconductor had any interest in serving an html
>> version of individual reference manuals in addition to the pdf that are
>> already available on the package landing pages.
>>
>> Is there anything I or any other members of the community could help with
>> to get this up and running?
>>
>> Best wishes,
>>
>> Laurent
>>
>> _______________________________________________
>> Bioc-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>
>
