[Bioc-devel] splitting simpleSingleCell into self-contained vignettes

Aaron Lun alun at wehi.edu.au
Mon Dec 11 21:24:22 CET 2017


Thanks Val:

Obenchain, Valerie wrote:
> Hi,
>
> On 12/11/2017 08:49 AM, Aaron Lun wrote:
>> Following up on our earlier discussion:
>>
>> https://stat.ethz.ch/pipermail/bioc-devel/2017-October/011949.html
>>
>> I have split the simpleSingleCell workflow into three (four, if you 
>> include the introductory overview) self-contained Rmarkdown files. I am 
>> preparing them for submission to BioC's workflow builder, and I would 
>> like to check what is the best way to do this:
>>
>> i) Each workflow file goes into its own package.
>>
>> ii) All workflow files go into a single package.
>>
>> Option (i) is logistically easier but probably a bit odd conceptually, 
>> especially if users need to download "simpleSingleCell1", 
>> "simpleSingleCell2", "simpleSingleCell3", etc.
>> Option (ii) is nicer but requires more coordination, as the BioC webpage 
>> builder needs to know that multiple HTMLs have been generated. It's 
>> also unclear to me whether this will run into problems with the DLL 
>> limit - does R restart when compiling each vignette?
> You could do either but I'd say option 2 is easier from a maintenance
> standpoint and probably for the user. Maybe you've seen this but an
> example is the annotation workflow package which houses 2 workflows:
>
> ~/repos/svn/workflows >ls annotation/vignettes/
> Annotating_Genomic_Ranges.Rmd  Annotation_Resources.Rmd 
> databaseTypes.png  display.png
>
> Each has an informative name and is presented on the website as an
> individual workflow:
>
> https://bioconductor.org/help/workflows/

I didn't know that, thanks.

> I don't think more coordination is involved - you just have multiple
> files in vignettes/. And, as you mentioned, it's a bonus that when a
> user downloads the annotation package they get all related workflows.
>
> A fresh R session is started for each package but not for each
> vignette in the package.
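
For illustration, the single-package layout described above could be sketched as follows. The package directory and the vignette file names are hypothetical placeholders (three workflow files plus an overview), not the names used in the actual submission, and a real package would also need a DESCRIPTION file and the other usual package files.

```shell
# Sketch of option (ii): one workflow package holding several vignettes.
# The directory and file names below are hypothetical placeholders.
pkg="${PKG_DIR:-./simpleSingleCell-demo}"
mkdir -p "$pkg/vignettes"

# One Rmarkdown file per self-contained workflow, plus the overview.
for name in overview part1 part2 part3; do
    touch "$pkg/vignettes/$name.Rmd"
done

# List the vignettes that would be built from this package.
ls "$pkg/vignettes"
```

Since all vignettes in a package are built in the same R session, shared objects loaded by one vignette stay loaded while the later ones are built.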

Ah. That's a shame, I was hoping to reduce the sensitivity to the DLL limit.

But now that I think about it: maybe that's not actually a problem,
provided the BioC workflow builders have a high DLL limit. The main
issue was that *users* were running into the DLL limit; by splitting the
workflow up, users should not be tempted to run everything at once, thus
avoiding the limit on their machines. Of course, Bioconductor can
control its own build machines, so as long as they set R_MAX_NUM_DLLS
high, it should still build and show up on the website.
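
As a rough sketch of what raising the limit looks like: in R 3.4.0 and later, the DLL limit is read from the environment variable R_MAX_NUM_DLLS when R starts, so it has to be set before the session begins, for example in a .Renviron file in the directory where R is launched. The value 150 and the demo directory below are illustrative assumptions, not settings used on the Bioconductor build machines.

```shell
# Sketch: raise R's DLL limit for one project by writing R_MAX_NUM_DLLS
# into a project-level .Renviron, which R reads at startup.
# The directory name and the value 150 are illustrative assumptions.
project_dir="${PROJECT_DIR:-./dll-limit-demo}"
mkdir -p "$project_dir"
renviron="$project_dir/.Renviron"
touch "$renviron"

# Append the setting only if it is not already present.
if ! grep -q '^R_MAX_NUM_DLLS=' "$renviron"; then
    printf 'R_MAX_NUM_DLLS=150\n' >> "$renviron"
fi

# Show the resulting file.
cat "$renviron"
```

Inside an R session started from that directory, length(getLoadedDLLs()) reports how many DLLs are currently loaded, which gives a sense of how close a workflow gets to the limit.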

>> Any thoughts would be appreciated. I'm also happy to be a guinea pig for 
>> any SVN->Git transition for the workflow packages, if that's on the radar.
>
> Nitesh has created git repos for the workflow packages and Andrzej is
> adapting the BBS code to incorporate them into the builds. We
> guesstimate this will be done by the end of the year. You shouldn't
> have to do anything on your end - once we're ready to switch over
> we'll let you know and send the new location of the workflow in git.

Cool, looking forward to it.

-A

> Val
>> Cheers,
>>
>> Aaron