[Bioc-devel] splitting simpleSingleCell into self-contained vignettes

Aaron Lun Aaron.Lun at cruk.cam.ac.uk
Tue Dec 12 19:19:09 CET 2017


The split-up workflows seem to have built successfully:

http://docbuilder.bioconductor.org:8080/job/simpleSingleCell/

Is there something I have to do to get a blurb specific to each 
vignette, as observed for "Annotation_Resources" vs 
"Annotating_Genomic_Ranges"?

The various vignettes are ordered pedagogically, so the order in which 
they are presented in the workflow page might require some manual 
specification. It would also be nice if the multiple simpleSingleCell 
workflows were grouped together, so that they are not intermingled with 
other workflows on the page.

Finally, could we get a separate "single-cell workflows" section? The 
current "Basic/Advanced" partition is pretty crude, and I can see 
opportunities for more detailed stratification, e.g., by ChIP-seq, 
RNA-seq, single-cell RNA-seq, proteomics (including mass cytometry).

Cheers,

Aaron


On 11/12/17 20:24, Aaron Lun wrote:
> Thanks Val:
> 
> Obenchain, Valerie wrote:
>> Hi,
>>
>> On 12/11/2017 08:49 AM, Aaron Lun wrote:
>>> Following up on our earlier discussion:
>>>
>>> https://stat.ethz.ch/pipermail/bioc-devel/2017-October/011949.html
>>>
>>> I have split the simpleSingleCell workflow into three (four, if you
>>> include the introductory overview) self-contained Rmarkdown files. I am
>>> preparing them for submission to BioC's workflow builder, and I would
>>> like to check what the best way to do this is:
>>>
>>> i) Each workflow file goes into its own package.
>>>
>>> ii) All workflow files go into a single package.
>>>
>>> Option (i) is logistically easier but probably a bit odd conceptually,
>>> especially if users need to download "simpleSingleCell1",
>>> "simpleSingleCell2", "simpleSingleCell3", etc.
>>> Option (ii) is nicer but requires more coordination, as the BioC webpage
>>> builder needs to know that multiple HTML files have been generated. It's
>>> also unclear to me whether this will run into problems with the DLL
>>> limit - does R restart when compiling each vignette?
>> You could do either but I'd say option 2 is easier from a maintenance
>> standpoint and probably for the user. Maybe you've seen this but an
>> example is the annotation workflow package which houses 2 workflows:
>>
>> ~/repos/svn/workflows >ls annotation/vignettes/
>> Annotating_Genomic_Ranges.Rmd  Annotation_Resources.Rmd
>> databaseTypes.png  display.png
>>
>> Each has an informative name and is presented on the website as an
>> individual workflow:
>>
>> https://bioconductor.org/help/workflows/
> 
> I didn't know that, thanks.
> 
>> I don't think more coordination is involved - you just have multiple
>> files in vignettes/. And, as you mentioned, it's a bonus that when a
>> user downloads the annotation package they get all related workflows.
>>
>> A fresh R session is started for each package but not for each
>> vignette in the package.
> 
> Ah. That's a shame, I was hoping to reduce the sensitivity to the DLL limit.
> 
> But now that I think about it: maybe that's not actually a problem,
> provided the BioC workflow builders have a high DLL limit. The main
> issue was that *users* were running into the DLL limit; by splitting the
> workflow up, users should no longer be tempted to run everything at once, thus
> avoiding the limit on their machines. Of course, Bioconductor can
> control its own build machines, so as long as they set the MAX_DLLs
> high, it should still build and show up on the website.
> 
>>> Any thoughts would be appreciated. I'm also happy to be a guinea pig for
>>> any SVN->Git transition for the workflow packages, if that's on the radar.
>>
>> Nitesh has created git repos for the workflow packages and Andrzej is
>> adapting the BBS code to incorporate them into the builds. We
>> guesstimate this will be done by the end of the year. You shouldn't
>> have to do anything on your end - once we're ready to switch over
>> we'll let you know and send the new location of the workflow in git.
> 
> Cool, looking forward to it.
> 
> -A
> 
>> Val
>>> Cheers,
>>>
>>> Aaron
>>> _______________________________________________
>>> Bioc-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>
> 
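
On the DLL limit discussed in the quoted thread: a minimal sketch of how a
user (or a build machine) might raise the cap, assuming R 3.4.0 or later,
where the limit is read from the R_MAX_NUM_DLLS environment variable at
startup. The helper name set_max_dlls, the file path and the value 150 are
hypothetical and only for illustration; this is not a description of how the
BioC builders are actually configured.

```shell
# Hypothetical helper: record a DLL cap in an Renviron-style file.
# Assumption: R (>= 3.4.0) reads R_MAX_NUM_DLLS once, at startup.
set_max_dlls() {
    renviron="$1"   # path to the Renviron file, e.g. "$HOME/.Renviron"
    limit="$2"      # illustrative value, e.g. 150
    touch "$renviron"
    # Drop any previous setting so the file ends up with exactly one entry.
    # grep exits non-zero when nothing is left to print, hence the "|| true".
    grep -v '^R_MAX_NUM_DLLS=' "$renviron" > "$renviron.tmp" || true
    mv "$renviron.tmp" "$renviron"
    echo "R_MAX_NUM_DLLS=$limit" >> "$renviron"
}
```

Calling set_max_dlls "$HOME/.Renviron" 150 and then starting a fresh R
session should apply the new cap; within R, length(getLoadedDLLs()) reports
how many DLLs are currently loaded, which gives a rough idea of how close a
session is to the limit.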

-- 
Aaron Lun
Research Associate, CRUK Cambridge Institute
University of Cambridge

