[Bioc-devel] Large vignettes

Vincent Carey stvjc using channing.harvard.edu
Thu Jan 12 12:42:57 CET 2023

On Thu, Jan 12, 2023 at 5:29 AM Lluís Revilla <lluis.revilla using gmail.com>
wrote:

> Hi all,
> Perhaps instead of long vignettes, it would be better to use a book hosted
> and in sync with the packages at Bioconductor.
> There are already a few: https://www.bioconductor.org/books/release/
> But I was not able to find how to submit such bookdowns to Bioconductor (I
> briefly searched the website and the dev book at
> https://contributions.bioconductor.org/docs.html?q=book).
> I think the limits are less restrictive and there is no minimum size of
> chapters or documentation, but I am not sure.

I like this idea -- but I want to say a few things, really just personal
observations, nothing "official", and some of these remarks may need
correction.

First, the concept that the Bioconductor build system could handle
monograph-size artifacts was improvised as the OSCA book came into being.
There is interesting and intricate infrastructure there supporting
cross-referenced computations to reduce redundant computation, but I don't
think that has become an "authoring standard", and it requires some
specialized knowledge to use.  Upshot -- we have some attractive examples
of monograph/book artifacts, but we don't really have a standard approach
to guide authors to efficiently deployable products, and the tooling to
build and check the monographs regularly is somewhat limited.

Second, a book becomes a sub-ecosystem, necessarily of both CRAN and
Bioconductor.  We want the book to remain valid and computable at all
times, certainly in the release branch, but as packages on which the book
depends change and perhaps disappear (this happens primarily with CRAN),
book production can fail.  Authors have to be vigilant and responsive to
events of this kind.
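
As a rough illustration of the kind of vigilance meant here (a minimal
sketch, not an existing Bioconductor tool): compare the packages a book
needs against the packages a repository currently offers, and report any
that have gone missing.  The sketch is in Python only to stay
self-contained; the function name `missing_dependencies` and the package
names are hypothetical.  In R, the set of available packages could
presumably be obtained from `available.packages()`.

```python
def missing_dependencies(required, available):
    """Return the required package names that are absent from `available`, sorted."""
    return sorted(set(required) - set(available))


# Hypothetical package names, for illustration only.
book_needs = ["pkgA", "pkgB", "pkgC"]
repo_has = {"pkgA", "pkgC", "pkgD"}

gone = missing_dependencies(book_needs, repo_has)
if gone:
    # A scheduled job could turn this message into a notification to the authors.
    print("book build at risk; no longer available:", gone)
```

Run on a schedule, a check like this would warn authors before the book
build itself fails.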

Third, the narrative of a book is synchronized with the computations when
it is authored, but underlying software evolution can make prose
statements in the book become false over time.  We saw this with text
describing cluster identities in single-cell analysis ... when a certain
projection function in an upstream package was modified, cluster labeling
silently changed and the text became false.  We want some kind of flagging
procedure that will alert us to changes of this sort.
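
One possible shape for such a flagging procedure, offered only as a
minimal sketch under stated assumptions: record a fingerprint of the
cluster assignments when the chapter is written, then compare it with the
fingerprint produced at each rebuild.  The sketch is in Python only to
stay self-contained; the function names (`label_fingerprint`,
`changed_cells`) and the example labels are hypothetical and do not come
from any Bioconductor package.

```python
import hashlib
import json


def label_fingerprint(labels):
    """Return a SHA-256 hex digest of a cell -> cluster-label mapping."""
    # Sort keys so the digest does not depend on insertion order.
    payload = json.dumps(labels, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def changed_cells(old, new):
    """List cells whose label differs, or that appear in only one mapping."""
    return sorted(c for c in set(old) | set(new) if old.get(c) != new.get(c))


# Hypothetical labels recorded when the chapter was written ...
authored = {"cell1": "A", "cell2": "A", "cell3": "B"}
# ... and labels produced by a later rebuild after an upstream change.
rebuilt = {"cell1": "A", "cell2": "B", "cell3": "B"}

if label_fingerprint(authored) != label_fingerprint(rebuilt):
    print("cluster labels changed for:", changed_cells(authored, rebuilt))
```

Only the fingerprint needs to be stored alongside the text; the per-cell
comparison is there to help an author see what moved once a mismatch has
been flagged.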

There are technical responses to all of these observations, but
implementing well-engineered solutions will require more resources than we
currently have.  The workshop authoring method used in
https://github.com/seandavi/BuildABiocWorkshop is surely relevant; Alex
Mahmoud has a work in progress called BiocDeployables that is also
relevant.  Ultimately we want to improve communication of good analytic
methods to the scientific community, and monograph-scale resources are
definitely useful, but smaller-scale resources that don't require the
technology of package production can also be valuable, and BiocDeployables
goes in that direction.  Maintenance and the avoidance of bit/doc rot are
first-class concerns and really require author commitment.

> Some authors already have books outside Bioconductor that provide
> extensive examples of their packages.
> They will also benefit from having them within the Bioconductor framework
> and in sync with the packages released to the users.
> Best,
> Lluís
> On Wed, 4 Jan 2023 at 21:39, Vincent Carey <stvjc using channing.harvard.edu>
> wrote:
>> I am glad you brought this up here, and I welcome further discussion on
>> this mailing list.  It is important to understand the constraints on
>> development that arise from Bioconductor's package guidelines.
>> I don't think we want to change the limits on package payload size without
>> understanding the consequences for users and our build system.  The split
>> approach mentioned by Lambda seems sensible to me, and I hope it is
>> not too burdensome.  Additional commentary and details from the community
>> are welcome.
>> On Wed, Jan 4, 2023 at 3:21 PM Lambda Moses <dlu2 using caltech.edu> wrote:
>> > Hi Adam,
>> >
>> > I also got this problem, and I would like some input from Bioc Core
>> > Team. I worked around it by writing a minimal vignette in the main
>> > branch. Then I made a documentation branch, where I have the same code
>> > as in the main branch, but with more elaborate vignettes used to build a
>> > pkgdown website. I made a rule for myself that I can only merge from the
>> > main or devel branch to the documentation branch but not the other way
>> > round. I would switch branches when I find a bug or want a new feature
>> > while writing the vignettes. You can see the main branch here:
>> > https://github.com/pachterlab/voyager/tree/main
>> > The documentation branch here:
>> > https://github.com/pachterlab/voyager/tree/documentation
>> >
>> > I kind of wonder if the 5 MB rule is outdated in the age of increasing
>> > computer power and internet speed. A jpeg photo can easily exceed 5 MB.
>> > I also wonder if this rule is deliberately kept for good reasons, like
>> > to make R more inclusive to disadvantaged people with limited internet
>> > services.
>> >
>> > Regards,
>> >
>> > Lambda
>> >
