[Bioc-devel] DMRcate: TIMEOUT

Tim Peters t.peters at garvan.org.au
Thu Jun 1 04:39:16 CEST 2017


Thanks Martin and Sean, found the culprit. It's DSS::DMLtest(), which
looks like it takes much longer under the new version. I should be able
to cut down the vignette input considerably, so hopefully the build will
now beat the timeout.
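
For anyone chasing a similar slowdown, here is a rough sketch of one way to narrow it down once the vignette has been tangled with Stangle(): time every top-level expression in the resulting script and sort by elapsed time. The helper name time_script() and the small demo file are made up for illustration; they are not part of DMRcate, DSS or any other package.

```r
# Hypothetical helper (not from any package): evaluate each top-level
# expression of an R script in turn and record its elapsed time.
time_script <- function(path, envir = new.env(parent = globalenv())) {
  exprs <- parse(file = path, keep.source = FALSE)
  elapsed <- vapply(exprs, function(e) {
    system.time(eval(e, envir = envir))[["elapsed"]]
  }, numeric(1))
  # Short label for each expression so the slow ones are easy to spot.
  labels <- vapply(exprs, function(e) {
    substr(paste(deparse(e), collapse = " "), 1, 60)
  }, character(1))
  out <- data.frame(expr = labels, elapsed = elapsed,
                    stringsAsFactors = FALSE)
  out[order(out$elapsed, decreasing = TRUE), ]
}

# Tiny stand-in for a tangled vignette, so the sketch runs on its own.
demo_file <- tempfile(fileext = ".R")
writeLines(c("x <- 1:10", "Sys.sleep(0.3)", "y <- sum(x)"), demo_file)
timings <- time_script(demo_file)
print(timings)
```

On a real package the call would be something like time_script("DMRcate.R") after Stangle("DMRcate.Rnw"); the slowest expressions come out at the top of the table.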

For R/3.3.3, on a bsseq object with 5000 CpG sites, DMLtest() under 
DSS_2.14.0 takes roughly 4 seconds:

 > system.time(DSSres <- DMLtest(obj_bsseq, group1=sampnames[1:3], 
group2=sampnames[4:6], smoothing=FALSE))
Estimating dispersion for each CpG site, this will take a while ...
    user  system elapsed
   3.724   0.017   3.738
 > sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS release 6.8 (Final)

locale:
  [1] LC_CTYPE=en_US.iso885915 LC_NUMERIC=C
  [3] LC_TIME=en_US.iso885915 LC_COLLATE=en_US.iso885915
  [5] LC_MONETARY=en_US.iso885915 LC_MESSAGES=en_US.iso885915
  [7] LC_PAPER=en_US.iso885915 LC_NAME=C
  [9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.iso885915 LC_IDENTIFICATION=C

attached base packages:
  [1] splines   parallel  stats4    stats     graphics  grDevices utils
  [8] datasets  methods   base

other attached packages:
  [1] DSS_2.14.0                 bsseq_1.10.0
  [3] limma_3.30.13              SummarizedExperiment_1.4.0
  [5] Biobase_2.34.0             DMRcatedata_1.10.1
  [7] GenomicRanges_1.26.4       GenomeInfoDb_1.10.3
  [9] IRanges_2.8.2              S4Vectors_0.12.2
[11] BiocGenerics_0.20.0

loaded via a namespace (and not attached):
  [1] Rcpp_0.12.11       XVector_0.14.1     zlibbioc_1.20.0    munsell_0.4.3
  [5] colorspace_1.3-2   lattice_0.20-35    plyr_1.8.4         tools_3.3.3
  [9] grid_3.3.3         data.table_1.10.4  R.oo_1.21.0        gtools_3.5.0
[13] matrixStats_0.52.2 permute_0.9-4      Matrix_1.2-10      R.utils_2.5.0
[17] bitops_1.0-6       RCurl_1.95-4.8     R.methodsS3_1.7.1  scales_0.4.1
[21] locfit_1.5-9.1


Under R/3.4.0 with DSS_2.16.0, the same call takes about 108 seconds:


 > system.time(DSSres <- DMLtest(obj_bsseq, group1=sampnames[1:3], 
group2=sampnames[4:6], smoothing=FALSE))
Estimating dispersion for each CpG site, this will take a while ...
    user  system elapsed
108.305   0.184 108.596
 > sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.0
LAPACK: /usr/lib/lapack/liblapack.so.3.0

locale:
  [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8
  [5] LC_MONETARY=en_AU.UTF-8    LC_MESSAGES=en_AU.UTF-8
  [7] LC_PAPER=en_AU.UTF-8       LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C

attached base packages:
  [1] splines   parallel  stats4    stats     graphics  grDevices utils
  [8] datasets  methods   base

other attached packages:
  [1] DSS_2.16.0                 bsseq_1.12.1
  [3] SummarizedExperiment_1.6.2 DelayedArray_0.2.4
  [5] matrixStats_0.52.2         Biobase_2.36.2
  [7] DMRcatedata_1.12.0         GenomicRanges_1.28.3
  [9] GenomeInfoDb_1.12.1        IRanges_2.10.2
[11] S4Vectors_0.14.2           BiocGenerics_0.22.0

loaded via a namespace (and not attached):
  [1] Rcpp_0.12.11            XVector_0.16.0          zlibbioc_1.22.0
  [4] munsell_0.4.3           colorspace_1.3-2        lattice_0.20-35
  [7] plyr_1.8.4              tools_3.4.0             grid_3.4.0
[10] data.table_1.10.4       R.oo_1.21.0             gtools_3.5.0
[13] permute_0.9-4           Matrix_1.2-10           GenomeInfoDbData_0.99.0
[16] R.utils_2.5.0           bitops_1.0-6            RCurl_1.95-4.8
[19] limma_3.32.2            compiler_3.4.0          R.methodsS3_1.7.1
[22] scales_0.4.1            locfit_1.5-9.1
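
To put a number on the difference, the short calculation below takes the elapsed times from the two system.time() outputs above and the build times from the original report quoted further down. The variable names are only for illustration. Note that the two DMLtest() runs were on different operating systems (CentOS 6.8 and Ubuntu 14.04.5), so this is a rough comparison, not a controlled benchmark.

```r
# Elapsed seconds copied from the two system.time() outputs.
elapsed <- c(DSS_2.14.0 = 3.738, DSS_2.16.0 = 108.596)

# Wall-clock build times in seconds, from the original report:
# 4 min 53 s under R 3.3.3 and 47 min 33.472 s under R 3.4.0.
build <- c(R_3.3.3 = 4 * 60 + 53, R_3.4.0 = 47 * 60 + 33.472)

dmltest_ratio <- unname(elapsed["DSS_2.16.0"] / elapsed["DSS_2.14.0"])
build_ratio   <- unname(build["R_3.4.0"] / build["R_3.3.3"])

cat(sprintf("DMLtest():   %.1fx slower\n", dmltest_ratio))
cat(sprintf("R CMD build: %.1fx slower\n", build_ratio))
```

So DMLtest() on its own is roughly 29-fold slower here, while the whole build is a little under 10-fold slower.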


Thanks for your recommendations, I should be fine from here on.

Cheers,

Tim


On 31/05/17 21:36, Martin Morgan wrote:
> On 05/31/2017 06:18 AM, Sean Davis wrote:
>> Hi, Tim.
>>
>> Have you tried building the vignette independently, separate from the
>> package build process? Doing so might give you some hints about which
>> code blocks are the culprits.
>
> Also
>
> > Stangle("DMRcate.Rnw")
> Writing to file DMRcate.R
> > source("DMRcate.R", echo=TRUE, max=Inf)
>
> to generate the R code and process it.
>
> Martin
>
>>
>> Sean
>>
>>
>> On Wed, May 31, 2017 at 1:24 AM, Tim Peters <t.peters at garvan.org.au> 
>> wrote:
>>
>>> Hi bioc,
>>>
>>> Recently, under R/3.4.0, the newest version of DMRcate is taking 47mins
>>> to build on my local machine and I am getting TIMEOUTs on the build on
>>> the checkResults page
>>> http://master.bioconductor.org/checkResults/3.5/bioc-LATEST/DMRcate/malbec2-buildsrc.html.
>>> I have not made any changes to the attached data package or routines
>>> that warrant this increase in time. The vignette building in particular
>>> takes up the bulk of the time and this is where the build hangs.
>>> Building the same source on R/3.3.3 takes only 4mins 53secs by
>>> comparison.
>>>
>>> Screendump below. Can someone provide an insight into what they think
>>> is happening here?
>>>
>>> timpet@clark-lab:~/Documents$ time R CMD build DMRcate
>>> * checking for file ‘DMRcate/DESCRIPTION’ ... OK
>>> * preparing ‘DMRcate’:
>>> * checking DESCRIPTION meta-information ... OK
>>> * installing the package to build vignettes
>>> * creating vignettes ... OK
>>> * checking for LF line-endings in source and make files
>>> * checking for empty or unneeded directories
>>> * building ‘DMRcate_1.12.1.tar.gz’
>>>
>>>
>>> real    47m33.472s
>>> user    46m55.023s
>>> sys    0m4.647s
>>> timpet@clark-lab:~/Documents$ R -e "sessionInfo()"
>>>
>>> R version 3.4.0 (2017-04-21) -- "You Stupid Darkness"
>>> Copyright (C) 2017 The R Foundation for Statistical Computing
>>> Platform: x86_64-pc-linux-gnu (64-bit)
>>>
>>> R is free software and comes with ABSOLUTELY NO WARRANTY.
>>> You are welcome to redistribute it under certain conditions.
>>> Type 'license()' or 'licence()' for distribution details.
>>>
>>>     Natural language support but running in an English locale
>>>
>>> R is a collaborative project with many contributors.
>>> Type 'contributors()' for more information and
>>> 'citation()' on how to cite R or R packages in publications.
>>>
>>> Type 'demo()' for some demos, 'help()' for on-line help, or
>>> 'help.start()' for an HTML browser interface to help.
>>> Type 'q()' to quit R.
>>>
>>>   > sessionInfo()
>>> R version 3.4.0 (2017-04-21)
>>> Platform: x86_64-pc-linux-gnu (64-bit)
>>> Running under: Ubuntu 14.04.5 LTS
>>>
>>> Matrix products: default
>>> BLAS: /usr/lib/libblas/libblas.so.3.0
>>> LAPACK: /usr/lib/lapack/liblapack.so.3.0
>>>
>>> locale:
>>>    [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C
>>>    [3] LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8
>>>    [5] LC_MONETARY=en_AU.UTF-8    LC_MESSAGES=en_AU.UTF-8
>>>    [7] LC_PAPER=en_AU.UTF-8       LC_NAME=C
>>>    [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>> [11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C
>>>
>>> attached base packages:
>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>
>>> loaded via a namespace (and not attached):
>>> [1] compiler_3.4.0
>>>
>>> Best,
>>> Tim
>>>
>>> -- 
>>> Tim Peters, PhD
>>>
>>> Bioinformatics Research Officer | Epigenetics Research Laboratory |
>>> Genomics and Epigenetics Division
>>>
>>> Garvan Institute of Medical Research
>>>
>>> 384 Victoria St., Darlinghurst, NSW, Australia 2010
>>>
>>> Tel: +61 (2) 9295 8319
>>>
>>> _______________________________________________
>>> Bioc-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
