[Bioc-devel] Issue importing bigwig files with rtracklayer from Amazon Cloud Drive

Michael Lawrence lawrence.michael at gene.com
Fri May 6 02:10:35 CEST 2016


I checked in something that tries to find openssl automatically on the Mac.

It looks like AWS is for some reason returning 404 for the HEAD request
that the UCSC library uses to get info about the file, such as the content
size. The same thing happens when I play around in Firefox's developer tools.
The error response header claims a JSON content type, but no JSON is
actually sent, so there is no further description of the error. I think
this is a bug on Amazon's side.
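
To look at the HEAD response outside rtracklayer, here is a minimal sketch
in base R, assuming an R build with libcurl support. As far as I know,
curlGetHeaders() issues a header-only (HEAD) request and returns the raw
header lines. status_chain() and head_statuses() are hypothetical helper
names used only for illustration; they are not part of any package.

```r
# status_chain(): hypothetical helper that pulls the HTTP status code out of
# every status line ("HTTP/1.1 302 Found", ...) in a vector of header lines.
status_chain <- function(headers) {
    status_lines <- grep("^HTTP/", headers, value = TRUE)
    # The status code is the second whitespace-separated field.
    vapply(strsplit(status_lines, " +"),
           function(parts) as.integer(parts[2]),
           integer(1))
}

# head_statuses(): hypothetical wrapper that requests only the headers,
# follows redirects, and reports the status code of each hop.
# Requires network access and libcurl support in R.
head_statuses <- function(url) {
    status_chain(curlGetHeaders(url, redirect = TRUE))
}

## Usage (not run here, needs network access):
# head_statuses("http://duffel.rail.bio/recount/DRP000366/bw/DRR000897.bw")
```

The result has one status code per hop, so a redirect followed by a 404
would show up as two entries.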

Seems like for now you'll need to download the file first.
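
A minimal sketch of that workaround, assuming the curl command-line tool is
on the PATH and reusing the method = 'curl' / extra = '-L' combination
reported further down in this thread. fetch_bigwig() is a hypothetical
helper name, not an rtracklayer function.

```r
# fetch_bigwig(): hypothetical helper that downloads a remote BigWig file to
# a local path and returns that path.
fetch_bigwig <- function(url, destfile = tempfile(fileext = ".bw")) {
    # extra = "-L" makes curl follow the redirect to the effective URL.
    status <- utils::download.file(url, destfile, method = "curl",
                                   extra = "-L")
    # download.file() returns 0 on success.
    if (status != 0) {
        stop("download failed with status ", status)
    }
    destfile
}

## Usage (not run here, needs network access and rtracklayer):
# bw <- fetch_bigwig("http://duffel.rail.bio/recount/DRP000366/bw/DRR000897.bw")
# x <- rtracklayer::import.bw(bw, as = "RleList")
```

Importing from the local copy avoids the remote HEAD request entirely, at
the cost of downloading the whole file.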

Michael

On Thu, May 5, 2016 at 2:46 PM, Leonardo Collado Torres <lcollado at jhu.edu>
wrote:

> Hi Michael,
>
> I forgot about pkg-config (just did a fresh BioC 3.3 install). I assumed
> the OS X binary would work out of the box.
>
> Anyhow, I installed rtracklayer (release) manually and got another
> error (slightly different message now).
>
>
>
>
> $ svn co https://hedgehog.fhcrc.org/bioconductor/branches/RELEASE_3_3/madman/Rpacks/rtracklayer
> $ R CMD INSTALL rtracklayer
> Loading required package: colorout
> * installing to library
> ‘/Library/Frameworks/R.framework/Versions/3.3release/Resources/library’
> * installing *source* package ‘rtracklayer’ ...
> checking for pkg-config... /usr/local/bin/pkg-config
> checking pkg-config is at least version 0.9.0... yes
> checking for OPENSSL... yes
> ## more output
>
> $ R
> > library('rtracklayer')
> > unshorten_url <- function(uri) {
> +     require('RCurl')
> +     opts <- list(
> +         followlocation = TRUE,  # resolve redirects
> +         ssl.verifyhost = FALSE, # suppress certain SSL errors
> +         ssl.verifypeer = FALSE,
> +         nobody = TRUE, # perform HEAD request
> +         verbose = FALSE
> +     )
> +     curlhandle <- getCurlHandle(.opts = opts)
> +     getURL(uri, curl = curlhandle)
> +     info <- getCurlInfo(curlhandle)
> +     rm(curlhandle)  # release the curlhandle!
> +     info$effective.url
> + }
> > url <- unshorten_url('http://duffel.rail.bio/recount/DRP000366/bw/DRR000897.bw')
> Loading required package: RCurl
> Loading required package: bitops
> > url
> [1] "https://content-na.drive.amazonaws.com/cdproxy/templink/usTQCr2pAaI3tTps4AFQuz1H9kmm23EDYy39SQ3ke5EuFiZq5"
> > x <- import.bw(url, as = 'RleList')
> Error in seqinfo(ranges) : UCSC library operation failed
> In addition: Warning message:
> In seqinfo(ranges) :
>   Couldn't open https://content-na.drive.amazonaws.com/cdproxy/templink/usTQCr2pAaI3tTps4AFQuz1H9kmm23EDYy39SQ3ke5EuFiZq5
> > x <- import.bw('http://content-na.drive.amazonaws.com/cdproxy/templink/usTQCr2pAaI3tTps4AFQuz1H9kmm23EDYy39SQ3ke5EuFiZq5')
> Error in seqinfo(ranges) : UCSC library operation failed
> In addition: Warning messages:
> 1: In seqinfo(ranges) :
>   TCP non-blocking connect() to content-na.drive.amazonaws.com timed-out in select() after 10000 milliseconds - Cancelling!
> 2: In seqinfo(ranges) :
>   Couldn't open http://content-na.drive.amazonaws.com/cdproxy/templink/usTQCr2pAaI3tTps4AFQuz1H9kmm23EDYy39SQ3ke5EuFiZq5
> > ## Reproducibility info
> > message(Sys.time())
> 2016-05-05 17:38:30
> > options(width = 120)
> > devtools::session_info()
> Session info
> -----------------------------------------------------------------------------------------------------------
>  setting  value
>  version  R version 3.3.0 RC (2016-05-01 r70572)
>  system   x86_64, darwin13.4.0
>  ui       X11
>  language (EN)
>  collate  en_US.UTF-8
>  tz       America/New_York
>  date     2016-05-05
>
> Packages
> ---------------------------------------------------------------------------------------------------------------
>  package              * version  date       source
>  Biobase                2.32.0   2016-05-04 Bioconductor
>  BiocGenerics         * 0.18.0   2016-05-04 Bioconductor
>  BiocParallel           1.6.0    2016-05-04 Bioconductor
>  Biostrings             2.40.0   2016-05-04 Bioconductor
>  bitops               * 1.0-6    2013-08-17 CRAN (R 3.3.0)
>  colorout             * 1.1-2    2016-05-05 Github (jalvesaq/colorout@6538970)
>  devtools               1.11.1   2016-04-21 CRAN (R 3.3.0)
>  digest                 0.6.9    2016-01-08 CRAN (R 3.3.0)
>  GenomeInfoDb         * 1.8.0    2016-05-04 Bioconductor
>  GenomicAlignments      1.8.0    2016-05-04 Bioconductor
>  GenomicRanges        * 1.24.0   2016-05-04 Bioconductor
>  IRanges              * 2.6.0    2016-05-04 Bioconductor
>  memoise                1.0.0    2016-01-29 CRAN (R 3.3.0)
>  RCurl                * 1.95-4.8 2016-03-01 CRAN (R 3.3.0)
>  Rsamtools              1.24.0   2016-05-04 Bioconductor
>  rtracklayer          * 1.32.0   2016-05-05 Bioconductor
>  S4Vectors            * 0.10.0   2016-05-04 Bioconductor
>  SummarizedExperiment   1.2.0    2016-05-04 Bioconductor
>  withr                  1.0.1    2016-02-04 CRAN (R 3.3.0)
>  XML                    3.98-1.4 2016-03-01 CRAN (R 3.3.0)
>  XVector                0.12.0   2016-05-04 Bioconductor
>  zlibbioc               1.18.0   2016-05-04 Bioconductor
> >
>
> On Thu, May 5, 2016 at 5:24 PM, Michael Lawrence
> <lawrence.michael at gene.com> wrote:
> > The URL redirection is something I can try to add. For the other error, you
> > need to get openssl installed and made visible to pkg-config, so that
> > rtracklayer finds it during its build process.
> >
> > Michael
> >
> > On Thu, May 5, 2016 at 2:01 PM, Leonardo Collado Torres <lcollado at jhu.edu>
> > wrote:
> >>
> >> Hi Michael,
> >>
> >> I have a use case that is similar to
> >> https://support.bioconductor.org/p/81267/#82142 and looks to me like
> >> it might need some changes in rtracklayer to work. That's why I'm
> >> posting it here this time.
> >>
> >> Basically, I'm trying to use rtracklayer to import a bigwig file over
> >> the web from a different type of URL than before. Using
> >> utils::download.file() with the defaults doesn't work; I have to use
> >> method = 'curl' and extra = '-L'.
> >>
> >> More specifically, the original url
> >> http://duffel.rail.bio/recount/DRP000366/bw/DRR000897.bw has an
> >> effective url
> >>
> >> https://content-na.drive.amazonaws.com/cdproxy/templink/i_aQAPZJkJ9d9lN1NO5DJJtlbpvAdgbNuc1SkqSTHFouFiZq5
> >>
> >> Now, using the second URL with utils::download.file() and the default
> >> method also doesn't work. It does work in the browser, though.
> >>
> >>
> >> As you can see, downloading the file doesn't work out of the box, so I
> >> guess it's not surprising that with rtracklayer I get errors like:
> >>
> >> In seqinfo(ranges) :
> >>   No openssl available in netConnectHttps for content-na.drive.amazonaws.com : 443
> >>
> >> You can find further details (code and log file) at
> >> https://gist.github.com/lcolladotor/c500dd79d49aed1ef33ade5417111453
> >>
> >> Thanks,
> >> Leo
> >
> >
>
