[Bioc-devel] build errors: "Error in .seqlengths_TwoBitFile(x) : UCSC library operation failed"

Paul Shannon
Tue Apr 23 20:29:32 CEST 2019


Hi Herve,

Thanks for your reply!

> Is there a possibility that trena's code is having one worker 
> downloading/re-installing BSgenome.Hsapiens.UCSC.hg38 while at the same 
> time another worker is trying to access it?

I don’t think any download or reinstalling happens.  Several genome packages (hg38, hg19, mm10) are imported by trena as specified in the DESCRIPTION file, and so I assume they must be present after trena is built and installed.  Thus - and here’s where I may be confused - there should be nothing to trigger download or re-install as the tests, examples and vignettes are run.

In the constructor of the MotifMatcher class, this assignment is made

    if(genomeName == "hg38"){
       reference.genome <- BSgenome.Hsapiens.UCSC.hg38::BSgenome.Hsapiens.UCSC.hg38
    }

And used later like this:

    seqs <- as.character(BSgenome::getSeq(obj@reference.genome, gr.regions))

Hence my suggestion that no download or install takes place at run time.
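
One way to check that assumption without triggering any install is to ask R whether the installed package still exposes a readable, non-empty sequence file at the moment the tests run. The sketch below is only illustrative: `installed_file_ok` is a hypothetical helper, and the relative path `extdata/single_sequences.2bit` in the usage comment is an assumption about how the hg38 package lays out its data, not something confirmed in this thread.

```r
# Hypothetical helper: TRUE when an installed package ships a readable,
# non-empty file at `relpath`. It never loads, downloads or installs anything.
installed_file_ok <- function(pkg, relpath) {
  # system.file() returns "" when the package or the file is missing
  path <- system.file(relpath, package = pkg)
  isTRUE(nzchar(path) &&
         file.access(path, mode = 4) == 0 &&   # mode 4 = read permission
         file.size(path) > 0)
}

# Possible use at the top of a test (path is an assumption, see above):
# installed_file_ok("BSgenome.Hsapiens.UCSC.hg38", "extdata/single_sequences.2bit")
```

If a check like this returned FALSE only on the build machines, that would be consistent with the package being replaced while the tests are running.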


In the current design of the unit tests for MotifMatcher, I call the constructor in each test:

   jaspar.human.pfms <- as.list(query(query(MotifDb, "sapiens"), "jaspar2016"))
   motifMatcher <- MotifMatcher(genomeName="hg38", pfms=jaspar.human.pfms, quiet=TRUE)

For what it’s worth, this code has not changed in the last year, ran fine on the build system until recently, and passes R CMD check under R 3.6.0 beta on Ubuntu for me.  There is no parallelization in this class - but maybe the build system introduces some at a higher level?

I can condition these failing tests on hostname in order to pass the build tests if that is not too much of a dodge.
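
A minimal sketch of what conditioning a test on hostname could look like follows. The names `on_host` and `test_getSequence_guarded` are hypothetical, and the list of hosts to skip is left for the caller to supply, since the build machines' names are not given in this thread.

```r
# Hypothetical helper: TRUE when the current machine is one of `hosts`
# (case-insensitive comparison against the node name reported by the OS).
on_host <- function(hosts, current = Sys.info()[["nodename"]]) {
  tolower(current) %in% tolower(hosts)
}

# Sketch of an RUnit-style test function that returns early on listed hosts.
test_getSequence_guarded <- function(skip.hosts = character(0)) {
  if (on_host(skip.hosts)) {
    message("skipping on host ", Sys.info()[["nodename"]])
    return(invisible(FALSE))   # FALSE = test body was not run
  }
  # ... the real assertions would go here ...
  invisible(TRUE)              # TRUE = test body was run
}
```

A guard like this hides the failure rather than explaining it, so it is best treated as a stopgap while the underlying cause is investigated.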

 - Paul


> On Apr 23, 2019, at 12:19 AM, Pages, Herve <hpages at fredhutch.org> wrote:
> 
> Hi Paul,
> 
> Is there a possibility that trena's code is having one worker 
> downloading/re-installing BSgenome.Hsapiens.UCSC.hg38 while at the same 
> time another worker is trying to access it?
> 
> The reason I suspect something like this is that it seems that 
> BSgenome.Hsapiens.UCSC.hg38 gets reinstalled every night on the builders 
> and that this happens at the time the build system is running 'R CMD 
> check' on trena.
> 
> Package vignettes, examples, and unit tests should avoid re-installing 
> packages.
> 
> H.
> 
> On 4/22/19 15:01, Paul Shannon wrote:
>> I cannot reproduce daily build failures found in the trena package by the build system.  The build report shows:
>> 
>> trena RUnit Tests - 86 test functions, 7 errors, 0 failures
>> 
>> ERROR in test_.injectSnp: Error in .seqlengths_TwoBitFile(x) : UCSC library operation failed
>> ERROR in test_bugInStartEndOfMinusStrandHits: Error in .seqlengths_TwoBitFile(x) : UCSC library operation failed
>> ERROR in test_findMatchesByChromosomalRegion: Error in .seqlengths_TwoBitFile(x) : UCSC library operation failed
>> ERROR in test_findMatchesByChromosomalRegion.twoAlternateAlleles: Error in .seqlengths_TwoBitFile(x) : UCSC library operation failed
>> ERROR in test_findMatchesByMultipleChromosomalRegions: Error in .seqlengths_TwoBitFile(x) : UCSC library operation failed
>> ERROR in test_getSequence: Error in .seqlengths_TwoBitFile(x) : UCSC library operation failed
>> ERROR in test_noMatch: Error in .seqlengths_TwoBitFile(x) : UCSC library operation failed
>> 
>> This seems similar to a bioc support exchange from two years ago, which may suggest that the build system's BSgenome.Hsapiens.UCSC.hg38 is the locus of the problem.  I offer this suggestion very tentatively.
>> 
>>    support https://support.bioconductor.org/p/95963/
>> 
>> Any suggestions?
>> 
>>  - Paul
>> 
>> sessionInfo()  # from my clean R CMD check
>> R version 3.6.0 beta (2019-04-11 r76379)
>> Platform: x86_64-pc-linux-gnu (64-bit)
>> Running under: Ubuntu 16.04.5 LTS
>> 
>> Matrix products: default
>> BLAS:   /local/users/pshannon/src/R-beta/lib/libRblas.so
>> LAPACK: /local/users/pshannon/src/R-beta/lib/libRlapack.so
>> 
>> locale:
>>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>> 
>> attached base packages:
>> [1] stats4    parallel  stats     graphics  grDevices utils     datasets
>> [8] methods   base
>> 
>> other attached packages:
>>  [1] RPostgreSQL_0.6-2   DBI_1.0.0           RUnit_0.4.32
>>  [4] trena_1.5.14        MotifDb_1.22.0      Biostrings_2.48.0
>>  [7] XVector_0.20.0      IRanges_2.14.12     S4Vectors_0.18.3
>> [10] BiocGenerics_0.26.0 glmnet_2.0-16       foreach_1.4.4
>> [13] Matrix_1.2-17
>> 
>> loaded via a namespace (and not attached):
>>  [1] SummarizedExperiment_1.10.1       lassopv_0.2.0
>>  [3] progress_1.2.0                    lattice_0.20-38
>>  [5] rtracklayer_1.40.6                blob_1.1.1
>>  [7] XML_3.98-1.19                     rlang_0.3.4
>>  [9] flare_1.6.0                       BiocParallel_1.14.2
>> [11] bit64_0.9-7                       splitstackshape_1.4.8
>> [13] matrixStats_0.54.0                GenomeInfoDbData_1.1.0
>> [15] stringr_1.4.0                     zlibbioc_1.26.0
>> [17] codetools_0.2-16                  memoise_1.1.0
>> [19] Biobase_2.40.0                    biomaRt_2.36.1
>> [21] GenomeInfoDb_1.16.0               curl_3.3
>> [23] AnnotationDbi_1.42.1              lars_1.2
>> [25] Rcpp_1.0.1                        BSgenome_1.48.0
>> [27] DelayedArray_0.6.6                org.Hs.eg.db_3.6.0
>> [29] bit_1.1-14                        Rsamtools_1.32.3
>> [31] BSgenome.Hsapiens.UCSC.hg38_1.4.1 RMySQL_0.10.17
>> [33] hms_0.4.2                         digest_0.6.18
>> [35] stringi_1.4.3                     GenomicRanges_1.32.7
>> [37] grid_3.6.0                        tools_3.6.0
>> [39] bitops_1.0-6                      magrittr_1.5
>> [41] RCurl_1.95-4.12                   RSQLite_2.1.1
>> [43] randomForest_4.6-14               crayon_1.3.4
>> [45] vbsr_0.0.5                        pkgconfig_2.0.2
>> [47] MASS_7.3-51.4                     data.table_1.12.2
>> [49] prettyunits_1.0.2                 httr_1.4.0
>> [51] assertthat_0.2.1                  iterators_1.0.10
>> [53] R6_2.4.0                          GenomicAlignments_1.16.0
>> [55] igraph_1.2.4.1                    compiler_3.6.0
> 
> -- 
> Hervé Pagès
> 
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
> 
> E-mail: hpages at fredhutch.org
> Phone:  (206) 667-5791
> Fax:    (206) 667-1319
> 