[Bioc-devel] Data package timeouts

Leonardo Collado Torres lcollado at jhu.edu
Tue Dec 5 19:48:09 CET 2017


Hi Sean,

I'm still seeing some timeouts with GEOquery 2.46.10 on bioc-release.
Here's a quick example:

library('GEOquery')
getGEO('GSM1062236', getGPL = FALSE)

I found it from
https://github.com/leekgroup/recount/blob/master/tests/testthat/test-misc.R#L19
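
Until the parsing issue is resolved, a test like the one linked above could be made to fail fast instead of hanging. The following is only a minimal sketch: with_time_limit() is a hypothetical helper, not part of GEOquery or recount, built on base R's setTimeLimit(). It assumes the hang is in R-level code that checks for interrupts, and that R reports the limit with its English message.

```r
# Hypothetical helper (not part of GEOquery): evaluate `expr`, but give up
# after `seconds` of elapsed time instead of hanging indefinitely.
with_time_limit <- function(expr, seconds) {
    # Always clear the limit again, even if `expr` errors or times out
    on.exit(setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE))
    setTimeLimit(elapsed = seconds, transient = TRUE)
    tryCatch(
        expr,  # lazily evaluated here, i.e. after the limit has been set
        error = function(e) {
            # Assumption: R is running with English messages
            if (grepl("reached elapsed time limit", conditionMessage(e))) {
                stop("timed out after ", seconds, " seconds", call. = FALSE)
            }
            stop(e)  # re-raise any unrelated error unchanged
        }
    )
}

# Local demonstration that needs no network access
with_time_limit(1 + 1, seconds = 5)

# Intended use with the example from this email (requires GEOquery and
# network access, so it is left commented out):
# library('GEOquery')
# with_time_limit(getGEO('GSM1062236', getGPL = FALSE), seconds = 60)
```

The 60 second limit is an arbitrary choice for illustration. A call that is stuck inside compiled code that never checks for interrupts would not be stopped by this approach.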

Best,
Leo


> library('GEOquery')
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

    IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, cbind, colMeans, colnames,
    colSums, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
    grepl, intersect, is.unsorted, lapply, lengths, Map, mapply, match,
    mget, order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
    rbind, Reduce, rowMeans, rownames, rowSums, sapply, setdiff, sort,
    table, tapply, union, unique, unsplit, which, which.max, which.min

Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Setting options('download.file.method.GEOquery'='auto')
Setting options('GEOquery.inmemory.gpl'=FALSE)
> getGEO('GSM1062236', getGPL = FALSE)
File stored at:
/var/folders/cx/n9s558kx6fb7jf5z_pgszgb80000gn/T//RtmpAyQR3U/GSM1062236.soft
## Forcibly terminated after running for a long time
^C
> sessionInfo()
R version 3.4.2 (2017-09-28)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.6

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] GEOquery_2.46.10    Biobase_2.38.0      BiocGenerics_0.24.0
[4] colorout_1.1-2

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.14     tidyr_0.7.2      dplyr_0.7.4      assertthat_0.2.0
 [5] R6_2.2.2         magrittr_1.5     rlang_0.1.4      bindrcpp_0.2
 [9] limma_3.34.2     xml2_1.1.1       readr_1.1.1      glue_1.2.0
[13] purrr_0.2.4      hms_0.4.0        compiler_3.4.2   pkgconfig_2.0.1
[17] bindr_0.1        tibble_1.3.4


On Thu, Nov 30, 2017 at 11:56 AM, Leonardo Collado Torres
<lcollado at jhu.edu> wrote:
>
> Thanks Sean! I was also seeing GEOquery-related timeouts in recount, which I just recently looked into.
>
> On Thu, Nov 30, 2017 at 11:14 AM, Sean Davis <seandavi at gmail.com> wrote:
>>
>>
>> On Thu, Nov 30, 2017 at 6:05 AM, Mike Smith <grimbough at gmail.com> wrote:
>>
>> > Thanks for the speedy response Sean.  I'll switch back to the version
>> > using a file name shortly.
>> >
>>
>> No problem. Let me know if it does not work as expected.
>>
>> Sean
>>
>>
>>
>> >
>> > Cheers,
>> > Mike
>> >
>> > On 30 November 2017 at 11:20, Sean Davis <seandavi at gmail.com> wrote:
>> >
>> >> Thanks for the report, Mike.
>> >>
>> >> The problem was (specifically) in parsing a GSEMatrix file using a
>> >> filename. This should be fixed in versions 2.46.10 (release) and 2.47.12
>> >> (devel).
>> >>
>> >> Sean
>> >>
>> >>
>> >> On Thu, Nov 30, 2017 at 4:09 AM, Mike Smith <grimbough at gmail.com> wrote:
>> >>
>> >>> Hi Mike,
>> >>>
>> >>> I was experiencing similar problems with the BeadArrayUseCases vignette,
>> >>> where using getGEO() from GEOquery was getting stuck in a (seemingly)
>> >>> infinite loop processing a GSE series matrix file.  It looks like both of
>> >>> your examples try to do this too, so I suspect it's a similar issue.  I
>> >>> think the format of those files has changed recently and it seems to be
>> >>> causing a fair few issues with GEOquery.
>> >>>
>> >>> I temporarily settled on a solution of querying GEO directly rather
>> >>> than using a local file, but it would be nice to get it back working as
>> >>> intended.
>> >>>
>> >>> Mike
>> >>>
>> >>> On 29 November 2017 at 18:56, Michael Love <michaelisaiahlove at gmail.com>
>> >>> wrote:
>> >>>
>> >>> > I got simultaneous timeout notices for 'airway' and 'parathyroidSE' on
>> >>> > both release and devel machines (release was fine leading up to the
>> >>> > Bioc release).
>> >>> >
>> >>> > Not sure what the issue is; I haven't changed these packages in a
>> >>> > while. I checked both out, and they build fine in ~30s on my
>> >>> > machine (devel branch).
>> >>> >
>> >>> > Here are the reports for release:
>> >>> >
>> >>> > http://bioconductor.org/checkResults/release/data-experiment-LATEST/airway/malbec1-buildsrc.html
>> >>> > http://bioconductor.org/checkResults/release/data-experiment-LATEST/parathyroidSE/malbec1-buildsrc.html
>> >>> >
>> >>> > The vignettes are here:
>> >>> >
>> >>> > http://bioconductor.org/packages/3.6/data/experiment/vignettes/airway/inst/doc/airway.html
>> >>> > http://bioconductor.org/packages/3.6/data/experiment/vignettes/parathyroidSE/inst/doc/parathyroidSE.pdf
>> >>> >
>> >>> > best,
>> >>> > Mike
>> >>> >
>> >>> > _______________________________________________
>> >>> > Bioc-devel at r-project.org mailing list
>> >>> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
>> >>> >
>> >>>
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> Sean Davis, MD, PhD
>> >> Center for Cancer Research
>> >> National Cancer Institute
>> >> National Institutes of Health
>> >> Bethesda, MD 20892
>> >> https://seandavi.github.io/
>> >> https://twitter.com/seandavis12
>> >>
>> >
>> >
>>
>>
>>
>>
>


