[Bioc-devel] workflow building issues on mac and windows
Leonardo Collado Torres
lcollado at jhu.edu
Tue Sep 5 16:35:38 CEST 2017
Hi Andrzej,
Thank you for your replies and for looking into this!
Thank you for the detailed explanation about the beta workflow build
system and the use of rmarkdown::render with a custom template. Does
it support \@ref() calls?
Now that the Windows build is solved, I believe that the Mac failure is
related to https://support.bioconductor.org/p/93182/. That is, it
could be due to the versions reported by:
openssl version
pkg-config --version
Recent versions are required by rtracklayer to support URL forwarding.
We use links like
http://duffel.rail.bio/recount/SRP045638/bw/mean_SRP045638.bw in
recount that forward to IDIES, in case we need to move the data at
some point in the future.
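A quick way to collect that information on the Mac builder could look like the sketch below. The report_tool helper is a hypothetical name introduced here for illustration, and the last call assumes that the OpenSSL development files pkg-config resolves are the ones a source build would link against. I don't know the exact minimum versions required, so this only reports what is installed.

```shell
#!/bin/sh
# Print the first line of a tool's version output, or a clear note if the
# tool is not on the PATH. Usage: report_tool <command> [args...]
# (report_tool is a hypothetical helper, not part of any package.)
report_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 | head -n 1
    else
        echo "$1: not found"
    fi
}

report_tool openssl version
report_tool pkg-config --version
# Version of the OpenSSL development files that pkg-config resolves, if any
# (assumption: this is the copy a source build would link against).
report_tool pkg-config --modversion openssl
```

Comparing this output between the Linux builder (where the workflow builds) and the Mac builder should show whether the versions differ.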
As for the size of the BigWig file, don't worry about it. We use a
small portion of it (chr21) for the workflow so the 2GB RAM limit
shouldn't be an issue. The recount::expressed_regions() call
ultimately calls:
rtracklayer::import.bw(
    "http://duffel.rail.bio/recount/SRP045638/bw/mean_SRP045638.bw",
    selection = GenomicRanges::GRanges("chr21", IRanges::IRanges(1, 46709983)),
    as = "RleList"
)[["chr21"]]
> x <- rtracklayer::import.bw("http://duffel.rail.bio/recount/SRP045638/bw/mean_SRP045638.bw", selection = GenomicRanges::GRanges("chr21", IRanges::IRanges(1, 46709983)), as = "RleList")[["chr21"]]
> x
numeric-Rle of length 46709983 with 8092519 runs
  Lengths:  5010010  66  32  2 ...  6  65  6  10025
  Values :  0 0.00222299993038177 0.00444599986076355 0.00666899979114532 ... 0.00606400007382035 0.00828699953854084 0.00606400007382035 0
> print(object.size(x), units = "Mb")
92.6 Mb
> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.6
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
 [1] lattice_0.20-35      matrixStats_0.52.2         IRanges_2.10.2           XML_3.98-1.9
 [5] Rsamtools_1.28.0     Biostrings_2.44.2          GenomicAlignments_1.12.2 bitops_1.0-6
 [9] grid_3.4.1           GenomeInfoDb_1.12.2        stats4_3.4.1             zlibbioc_1.22.0
[13] XVector_0.16.0       S4Vectors_0.14.3           Matrix_1.2-11            BiocParallel_1.10.1
[17] tools_3.4.1          Biobase_2.36.2             RCurl_1.95-4.8           DelayedArray_0.2.7
[21] rtracklayer_1.36.4   parallel_3.4.1             compiler_3.4.1           BiocGenerics_0.22.0
[25] GenomicRanges_1.28.4 SummarizedExperiment_1.6.3 GenomeInfoDbData_0.99.0
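As a rough cross-check of the 92.6 Mb reported by object.size() above, the size can be estimated from the number of runs alone. This is a back-of-the-envelope sketch: it assumes each run stores one 8-byte double (the value) and one 4-byte integer (the run length), and it ignores the small fixed overhead of the object.

```shell
#!/bin/sh
# Number of runs reported for the chr21 Rle in the session above.
runs=8092519
# Assumption: 8 bytes per value (double) + 4 bytes per run length (integer).
bytes_per_run=12

total_bytes=$((runs * bytes_per_run))
# R's "Mb" unit in object.size() is 1024^2 bytes.
estimate=$(awk -v b="$total_bytes" 'BEGIN { printf "%.1f MiB", b / (1024 * 1024) }')

echo "$total_bytes bytes"   # 97110228 bytes
echo "$estimate"            # 92.6 MiB
```

So the in-memory size scales with the number of runs on chr21, not with the 7GB file, which is why the 2GB limit should leave plenty of room.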
Best,
Leo
On Fri, Sep 1, 2017 at 10:54 AM, Andrzej Oleś <andrzej.oles at gmail.com> wrote:
> I've updated pandoc on the win workflow builder to the latest release
> (1.19.2.1) which resolves the vignette compilation error there.
>
> Regarding the Mac builder the workflow fails in the call to
> derfinder::loadCoverage issued by the line
>
> regions <- expressed_regions("SRP045638", "chr21", cutoff = 5L,
> maxClusterGap = 3000L)
>
> 2017-09-01 07:32:44 loadCoverage: loading BigWig file
> http://duffel.rail.bio/recount/SRP045638/bw/mean_SRP045638.bw
> Error in .local(con, format, text, ...) : UCSC library operation failed
> In addition: Warning message:
> In .local(con, format, text, ...) : End of file reading 4096 bytes (got 0)
>
> I didn't look into what loadCoverage does internally, but maybe the error
> has something to do with the fact that the size of mean_SRP045638.bw is 7GB,
> while the builder vagrant box runs only on 2GB(!) of RAM.
>
> Best,
> Andrzej
>
>
> On Thu, Aug 31, 2017 at 11:25 PM, Andrzej Oleś <andrzej.oles at gmail.com>
> wrote:
>>
>> Hi Leo,
>>
>> first of all, many thanks for your efforts in troubleshooting workflow
>> build issues and the detailed description of your findings. Also,
>> congratulations on the successful publication of your new workflow on the
>> Bioconductor website!
>>
>> We are now beta testing a new workflow build engine which enables some
>> features not possible to achieve in the previous approach, such as
>> cross-references. The goal is to properly render the output for the BioC
>> website directly from the Rmd file authored for
>> BiocWorkflowTools::f1000_article without any special tweaks or hacks. This
>> new engine is enabled by adding the special '.html_output' token file in the
>> vignettes/ dir. Ideally, you should be able to get decent output from the
>> original Rmd file which you used for the F1000 Research submission without
>> any manual Bioc-specific modifications. Note that in principle you could
>> even leave your primary document output format as
>> `BiocWorkflowTools::f1000_article`. This is because the workflows for the
>> website are built through a call to `rmarkdown::render` by specifying a
>> custom render format, so the output format set in the document header is
>> discarded anyway; it is important only for the vignette included in the
>> package tarball. But it's also fine to use either of the formats provided by
>> BiocStyle.
>>
>> It shouldn't be necessary to use the captioner package anymore, as figure
>> numbering is now handled by bookdown. Gosia used it in cytofWorkflow before
>> I enabled the new html engine for her workflow, which I did only yesterday.
>>
>> Re the specific builder issues: I've deleted the /tmp/udcCache dir on Mac,
>> but it didn't help. I'm not sure yet what could be the problem there. The
>> citeproc conversion error on Windows might be because of the older pandoc
>> version (1.17.2), will look into this. The convert warnings appear because
>> the new BiocStyle formats call `knitr::opts_chunk$set(crop = TRUE)` to crop out
>> excessive white space around plots, will try to sanitize this a little bit
>> too.
>>
>> Cheers,
>> Andrzej
>>
>> On Thu, Aug 31, 2017 at 8:15 PM, Leonardo Collado Torres
>> <lcollado at jhu.edu> wrote:
>>>
>>> Hi,
>>>
>>> I recently got a workflow accepted and I've been trying to get the
>>> workflow builder, which is based on bioc-release, to complete
>>> successfully on Mac and Windows. Note that the workflow did build
>>> properly on those operating systems using the SPB (bioc-devel).
>>>
>>>
>>> ## Linux: working
>>>
>>> After looking at
>>>
>>> https://hedgehog.fhcrc.org/bioconductor/trunk/madman/workflows/cytofWorkflow
>>> and
>>> https://hedgehog.fhcrc.org/bioconductor/trunk/madman/workflows/rnaseqGene
>>> I was able to:
>>>
>>> (1) get the figure references and captions working using captioner
>>> (2) actually get the figures to show
>>> (3) use the new layout (thanks to vignettes/.html_output)
>>>
>>> Since the build completes on Linux,
>>> http://bioconductor.org/help/workflows/recountWorkflow/ is now live.
>>> Only the links to the Mac and Windows binaries fail.
>>>
>>>
>>> ## Mac: could be a permissions issue
>>>
>>> The issue with Mac (details at
>>>
>>> http://docbuilder.bioconductor.org:8080/job/recountWorkflow/label=vagrantmac/9/console)
>>> is that I eventually run into this error:
>>>
>>> Quitting from lines 735-747 (recount-workflow.Rmd)
>>> Error: processing vignette 'recount-workflow.Rmd' failed with
>>> diagnostics:
>>> UCSC library operation failed
>>> Execution halted
>>>
>>>
>>> The referenced lines break because of:
>>>
>>> regions <- expressed_regions("SRP045638", "chr21", cutoff = 5L,
>>> maxClusterGap = 3000L)
>>>
>>> This code uses rtracklayer::import.bw() with a URL that gets
>>> forwarded. I made sure that I am requiring the latest bioc-release
>>> rtracklayer in the description file, so this leads me to think that
>>> this issue is a repeat of
>>> https://stat.ethz.ch/pipermail/bioc-devel/2016-August/009599.html
>>> where the solution involved changing some permissions. Dan stated in
>>> that thread (3rd email): "Actually it looks like it was a permissions
>>> issue with the directory /tmp/udcCache. I removed this directory (as
>>> superuser) and that error no longer happens."
>>>
>>> Does this sound like something that could be happening in the Mac
>>> builder?
>>>
>>>
>>>
>>> ## Windows: figure and pandoc-citeproc issues
>>>
>>>
>>> In Windows, I see warnings like this:
>>>
>>> Invalid Parameter - -trim
>>> Warning: running command 'C:\Windows\system32\cmd.exe /c convert
>>> "exondeanalysis1-1.png" -trim "exondeanalysis1-1.png"' had status 4
>>> Warning in shell(paste(c(cmd, args), collapse = " ")) :
>>> 'convert "exondeanalysis1-1.png" -trim "exondeanalysis1-1.png"'
>>> execution failed with error code 4
>>>
>>> Details at
>>> http://docbuilder.bioconductor.org:8080/job/recountWorkflow/label=winbuilder1/9/console.
>>> I originally thought that it was related to the figure paths, which is
>>> why I added this knitr code:
>>>
>>> knitr::opts_chunk$set(fig.path = "")
>>>
>>> But that didn't resolve the issue. I get these warnings only with
>>> BiocStyle::html_document2 and not with BiocStyle::html_document
>>>
>>> (http://docbuilder.bioconductor.org:8080/job/recountWorkflow/label=winbuilder1/7/console
>>> using v0.99.28).
>>>
>>> Ultimately, the warnings might not matter, though I don't see them on
>>> the build reports for other workflows (cytofWorkflow, rnaseqGene).
>>>
>>> The Windows builds fail (with BiocStyle::html_document2 or
>>> BiocStyle::html_document) with error messages like this:
>>>
>>> pandoc.exe: Error running filter pandoc-citeproc Filter returned error
>>> status 1073807366 Warning: running command
>>> '"C:/Progra~2/Pandoc/pandoc" +RTS -K512m -RTS recount-workflow.utf8.md
>>> --to html --from
>>> markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash
>>> --output recount-workflow.html --smart --email-obfuscation none
>>> --self-contained --standalone --section-divs --table-of-contents
>>> --toc-depth 3 --template
>>> "C:\Windows\TEMP\RtmpchKz3E/BiocStyle/template.html" --no-highlight
>>> --variable highlightjs=1 --number-sections --css
>>> "C:\PROGRA~1\R\R-34~1.0\library\BIOCST~1\RESOUR~1\html\BIOCON~2.CSS"
>>> --variable "theme:bootstrap" --include-in-header
>>> "C:\Windows\TEMP\RtmpchKz3E\rmarkdown-str3ec7c88cea.html" --mathjax
>>> --variable
>>> "mathjax-url:https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"
>>> --bibliography recount-workflow.bib --filter pandoc-citeproc' had
>>> status 83 Error: processing vignette 'recount-workflow.Rmd' failed
>>> with diagnostics: pandoc document conversion failed with error 83
>>> Execution halted
>>>
>>> (from
>>> http://docbuilder.bioconductor.org:8080/job/recountWorkflow/label=winbuilder1/9/console)
>>>
>>>
>>> Googling the error pointed me towards having weird symbols on the .bib
>>> file. So I removed some accents and the like, but I still get this
>>> error. I am using a csl file for the bibliography and like I said,
>>> this works with the SPB whose last build report was
>>>
>>> http://bioconductor.org/spb_reports/recountWorkflow_buildreport_20170807114418.html
>>> (v0.99.18). There's not much difference between that version and
>>> 0.99.30 as you can see at
>>> https://github.com/LieberInstitute/recountWorkflow/commits/master **.
>>>
>>> The Linux machine has pandoc 1.19.2.1 installed
>>>
>>> (http://bioconductor.org/help/workflows/recountWorkflow/#session-information)
>>> and I guess that the Windows one has the same version. But maybe that
>>> could be the issue. BiocStyle 2.5.15 is installed in the Linux machine
>>> too (Bioconductor/BiocStyle at 2a1ba75) which is newer than the latest
>>> bioc-release (2.4.1), but again, I assume that the Windows machine has
>>> the same version.
>>>
>>>
>>>
>>> Anyhow, I haven't been able to fix these issues and was wondering if
>>> anyone else had ideas that could resolve them. Regardless, I'm happy
>>> that I got http://bioconductor.org/help/workflows/recountWorkflow/ up
>>> and looking good ^^. It's just the links to the Mac/Windows
>>> binaries...
>>>
>>>
>>> Thank you,
>>> Leo
>>>
>>>
>>>
>>> ** I've been pushing changes via svn, I know that workflows are not
>>> part of the git transition. I just keep everything in sync manually
>>> with the GitHub repo.
>>>
>>>
>>> Leonardo Collado Torres, Ph. D., Data Scientist
>>> Lieber Institute for Brain Development
>>> 855 N Wolfe St, Suite 300
>>> Baltimore, MD 21205
>>> Website: http://lcolladotor.github.io
>>>
>>> _______________________________________________
>>> Bioc-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>>
>