[Bioc-devel] build errors: "Error in .seqlengths_TwoBitFile(x) : UCSC library operation failed"
Paul Shannon
p@u|@thurmond@@h@nnon @end|ng |rom gm@||@com
Fri May 10 21:06:28 CEST 2019
Belated thanks, Herve, for getting this fixed for the release.
I think the same problem has popped up again, as seen in the latest trena build report:
o ERROR for 'R CMD check' on malbec2. See the details here:
https://master.bioconductor.org/checkResults/3.9/bioc-LATEST/trena/malbec2-checksrc.html
Warning in .seqlengths_TwoBitFile(x) :
  mustOpen: Can't open /home/biocbuild/bbs-3.9-bioc/R/library/BSgenome.Hsapiens.UCSC.hg38/extdata/single_sequences.2bit to read: No such file or directory
Timing stopped at: 6.522 0 6.522
Error in .seqlengths_TwoBitFile(x) : UCSC library operation failed
- Paul
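
The warning in the build report says the 2bit file could not be opened, so one way to tell "file missing at test time" apart from other failures is to check for it right before the genome is used. Below is a minimal sketch in base R. The helper name extdata_file_available is hypothetical (it is not part of trena or BSgenome); the package and file names are taken from the warning itself.

```r
# Hypothetical helper (not part of trena or BSgenome): report whether a file
# under an installed package's extdata directory is present right now.
extdata_file_available <- function(pkg, file) {
    # system.file() returns "" when the package or the file cannot be found
    path <- system.file("extdata", file, package = pkg)
    nzchar(path) && file.exists(path)
}

# Package and file names as they appear in the build report warning
hg38.ok <- extdata_file_available("BSgenome.Hsapiens.UCSC.hg38",
                                  "single_sequences.2bit")
if (!hg38.ok)
    message("single_sequences.2bit is not available at this moment")
```

Logging this at the start of each failing test would show whether the file is absent only during part of the check run, which is what a concurrent re-install would look like.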
> On Apr 24, 2019, at 10:39 PM, Pages, Herve <hpages using fredhutch.org> wrote:
>
> Hi Paul,
>
> Something/someone is definitely re-installing the
> BSgenome.Hsapiens.UCSC.hg38 while 'R CMD check trena' is running on the
> build machines. This has happened consistently for several consecutive
> nights on malbec2 (BioC 3.9 builds) and malbec1 (BioC 3.10 builds) where
> I've been monitoring this.
>
> The builds are parallelized at the "top level" i.e. several 'R CMD
> check' instances are running concurrently on different packages at any
> given time (e.g. 15 concurrent instances on malbec1 & malbec2). So we
> cannot exclude the possibility that another package could be pulling the
> rug from under trena's feet. However, the exact set of packages that is
> being checked at the time that BSgenome.Hsapiens.UCSC.hg38 gets
> re-installed will typically change from one build to the next and also
> across build machines. This makes it unlikely that the culprit is
> another package.
>
> Anyway, just to make sure, I've identified the 15 packages that were
> running at the time BSgenome.Hsapiens.UCSC.hg38 got re-installed last
> night on malbec1 (BioC 3.10 builds) and manually 'R CMD check'ed them
> (including trena which is one of them). None of them re-installed
> BSgenome.Hsapiens.UCSC.hg38. All this to say that I've not been able to
> reproduce this problem so far in an interactive session on the build
> machines.
>
> Puzzling! (and frustrating) I'll keep investigating...
>
> Note that trena is currently at version 1.5.14 in git but the last
> version of the source package that propagated is 1.5.8. Version 1.5.9
> (from Dec 6, 2018) and successive versions never seem to have propagated
> which suggests that the package has been erroring on malbec2 since Dec
> 2018. This makes it hard to know since when trena has been having the
> "UCSC library operation failed" problem on the build machines.
>
> Finally, another intriguing thing is that, according to the latest 3.8
> build result, trena's unit tests also seem to have a problem accessing
> a file that belongs to another package:
>
> https://bioconductor.org/checkResults/3.8/bioc-20190416/trena/merida1-checksrc.html
>
> Not the same problem but similar (and this time on Mac and not on
> Linux). Very puzzling!
>
> H.
>
>
> On 4/23/19 11:29, Paul Shannon wrote:
>> Hi Herve,
>>
>> Thanks for your reply!
>>
>>> Is there a possibility that trena's code is having one worker
>>> downloading/re-installing BSgenome.Hsapiens.UCSC.hg38 while at the same
>>> time another worker is trying to access it?
>> I don’t think any download or reinstalling happens. Several genome packages (hg38, hg19, mm10) are imported by trena as specified in the DESCRIPTION file, and so I assume they must be present after trena is built and installed. Thus - and here’s where I may be confused - there should be nothing to trigger download or re-install as the tests, examples and vignettes are run.
>>
>> In the constructor of the MotifMatcher class, this assignment is made
>>
>> if(genomeName == "hg38"){
>> reference.genome <- BSgenome.Hsapiens.UCSC.hg38::BSgenome.Hsapiens.UCSC.hg38
>> }
>>
>> And used later like this:
>>
>> seqs <- as.character(BSgenome::getSeq(obj@reference.genome, gr.regions))
>>
>> Hence my suggestion that no download or install takes place at run time.
>>
>>
>> In the current design of the unit tests for MotifMatcher, I call the constructor in each test:
>>
>> jaspar.human.pfms <- as.list(query(query(MotifDb, "sapiens"), "jaspar2016"))
>> motifMatcher <- MotifMatcher(genomeName="hg38", pfms=jaspar.human.pfms, quiet=TRUE)
>>
>> For what it’s worth, this code has been unchanged for the last year, ran fine on the build system until recently, and passes R CMD check under R 3.6.0 beta on Ubuntu for me. There is no parallelization in this class - but maybe the build system introduces some at a higher level?
>>
>> I can condition these failing tests on hostname in order to pass the build tests if that is not too much of a dodge.
>>
>> - Paul
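
Conditioning the failing tests on hostname, as proposed in the quoted message above, could look roughly like the sketch below. This is only an illustration: the helper on_build_machine and the test name test_needsHg38 are hypothetical, and it assumes the builders report their node names as "malbec1" and "malbec2", optionally followed by a domain.

```r
# Hypothetical helper: TRUE when the current machine is one of the named hosts.
# The default host names come from this thread and are an assumption about
# what Sys.info() reports on the builders.
on_build_machine <- function(hosts = c("malbec1", "malbec2"),
                             nodename = Sys.info()[["nodename"]]) {
    # Compare only the short host name, ignoring any domain suffix
    tolower(sub("\\..*$", "", nodename)) %in% hosts
}

# Hypothetical RUnit-style test function showing where the guard would go
test_needsHg38 <- function() {
    if (on_build_machine()) {
        message("skipping test_needsHg38 on a build machine")
        return(invisible(TRUE))
    }
    # the existing test body would go here
    invisible(TRUE)
}
```

Skipping this way hides the failure on the builders without explaining it, so it is a workaround and does not address the re-install.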
>>
>>
>>> On Apr 23, 2019, at 12:19 AM, Pages, Herve <hpages using fredhutch.org> wrote:
>>>
>>> Hi Paul,
>>>
>>> Is there a possibility that trena's code is having one worker
>>> downloading/re-installing BSgenome.Hsapiens.UCSC.hg38 while at the same
>>> time another worker is trying to access it?
>>>
>>> The reason I suspect something like this is that it seems that
>>> BSgenome.Hsapiens.UCSC.hg38 gets reinstalled every night on the builders
>>> and that this happens at the time the build system is running 'R CMD
>>> check' on trena.
>>>
>>> Package vignettes, examples, and unit tests should avoid re-installing
>>> packages.
>>>
>>> H.
>>>
>>> On 4/22/19 15:01, Paul Shannon wrote:
>>>> I cannot reproduce daily build failures found in the trena package by the build system. The build report shows:
>>>>
>>>> trena RUnit Tests - 86 test functions, 7 errors, 0 failures
>>>>
>>>> ERROR in test_.injectSnp: Error in .seqlengths_TwoBitFile(x) : UCSC library operation failed
>>>> ERROR in test_bugInStartEndOfMinusStrandHits: Error in .seqlengths_TwoBitFile(x) : UCSC library operation failed
>>>> ERROR in test_findMatchesByChromosomalRegion: Error in .seqlengths_TwoBitFile(x) : UCSC library operation failed
>>>> ERROR in test_findMatchesByChromosomalRegion.twoAlternateAlleles: Error in .seqlengths_TwoBitFile(x) : UCSC library operation failed
>>>> ERROR in test_findMatchesByMultipleChromosomalRegions: Error in .seqlengths_TwoBitFile(x) : UCSC library operation failed
>>>> ERROR in test_getSequence: Error in .seqlengths_TwoBitFile(x) : UCSC library operation failed
>>>> ERROR in test_noMatch: Error in .seqlengths_TwoBitFile(x) : UCSC library operation failed
>>>>
>>>> This seems similar to a bioc support exchange from two years ago, which may suggest that the build system's BSgenome.Hsapiens.UCSC.hg38 is the locus of the problem. I offer this suggestion very tentatively.
>>>>
>>>> support https://support.bioconductor.org/p/95963/
>>>>
>>>> Any suggestions?
>>>>
>>>> - Paul
>>>>
>>>> sessionInfo() # from my clean R CMD check
>>>> R version 3.6.0 beta (2019-04-11 r76379)
>>>> Platform: x86_64-pc-linux-gnu (64-bit)
>>>> Running under: Ubuntu 16.04.5 LTS
>>>>
>>>> Matrix products: default
>>>> BLAS: /local/users/pshannon/src/R-beta/lib/libRblas.so
>>>> LAPACK: /local/users/pshannon/src/R-beta/lib/libRlapack.so
>>>>
>>>> locale:
>>>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
>>>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
>>>> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
>>>> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
>>>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>>
>>>> attached base packages:
>>>> [1] stats4 parallel stats graphics grDevices utils datasets
>>>> [8] methods base
>>>>
>>>> other attached packages:
>>>> [1] RPostgreSQL_0.6-2 DBI_1.0.0 RUnit_0.4.32
>>>> [4] trena_1.5.14 MotifDb_1.22.0 Biostrings_2.48.0
>>>> [7] XVector_0.20.0 IRanges_2.14.12 S4Vectors_0.18.3
>>>> [10] BiocGenerics_0.26.0 glmnet_2.0-16 foreach_1.4.4
>>>> [13] Matrix_1.2-17
>>>>
>>>> loaded via a namespace (and not attached):
>>>> [1] SummarizedExperiment_1.10.1 lassopv_0.2.0
>>>> [3] progress_1.2.0 lattice_0.20-38
>>>> [5] rtracklayer_1.40.6 blob_1.1.1
>>>> [7] XML_3.98-1.19 rlang_0.3.4
>>>> [9] flare_1.6.0 BiocParallel_1.14.2
>>>> [11] bit64_0.9-7 splitstackshape_1.4.8
>>>> [13] matrixStats_0.54.0 GenomeInfoDbData_1.1.0
>>>> [15] stringr_1.4.0 zlibbioc_1.26.0
>>>> [17] codetools_0.2-16 memoise_1.1.0
>>>> [19] Biobase_2.40.0 biomaRt_2.36.1
>>>> [21] GenomeInfoDb_1.16.0 curl_3.3
>>>> [23] AnnotationDbi_1.42.1 lars_1.2
>>>> [25] Rcpp_1.0.1 BSgenome_1.48.0
>>>> [27] DelayedArray_0.6.6 org.Hs.eg.db_3.6.0
>>>> [29] bit_1.1-14 Rsamtools_1.32.3
>>>> [31] BSgenome.Hsapiens.UCSC.hg38_1.4.1 RMySQL_0.10.17
>>>> [33] hms_0.4.2 digest_0.6.18
>>>> [35] stringi_1.4.3 GenomicRanges_1.32.7
>>>> [37] grid_3.6.0 tools_3.6.0
>>>> [39] bitops_1.0-6 magrittr_1.5
>>>> [41] RCurl_1.95-4.12 RSQLite_2.1.1
>>>> [43] randomForest_4.6-14 crayon_1.3.4
>>>> [45] vbsr_0.0.5 pkgconfig_2.0.2
>>>> [47] MASS_7.3-51.4 data.table_1.12.2
>>>> [49] prettyunits_1.0.2 httr_1.4.0
>>>> [51] assertthat_0.2.1 iterators_1.0.10
>>>> [53] R6_2.4.0 GenomicAlignments_1.16.0
>>>> [55] igraph_1.2.4.1 compiler_3.6.0
>>>> _______________________________________________
>>>> Bioc-devel using r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>> --
>>> Hervé Pagès
>>>
>>> Program in Computational Biology
>>> Division of Public Health Sciences
>>> Fred Hutchinson Cancer Research Center
>>> 1100 Fairview Ave. N, M1-B514
>>> P.O. Box 19024
>>> Seattle, WA 98109-1024
>>>
>>> E-mail: hpages using fredhutch.org
>>> Phone: (206) 667-5791
>>> Fax: (206) 667-1319
>>>