[Bioc-devel] httr::GET() problem downloading a ExperimentHub resource
Robert Castelo
robert.castelo at upf.edu
Wed Mar 29 22:08:54 CEST 2023
good catch, but really enigmatic: the BAI files work, but the BAM files don't:
dat <-
read.csv("https://raw.githubusercontent.com/functionalgenomics/gDNAinRNAseqData/devel/inst/extdata/metadata_LiYu22subsetBAMfiles.csv")
rdatapath <- strsplit(dat$RDataPath, ":")
bamfiles <- unlist(rdatapath)[seq(1, 18, 2)]
baifiles <- unlist(rdatapath)[seq(2, 18, 2)]
bamurls <- paste0(dat$Location_Prefix, bamfiles)
baiurls <- paste0(dat$Location_Prefix, baifiles)
## BAM files give error
for (bf in bamurls) {
  cat(sprintf("%s\n", basename(bf)))
  tryCatch({
    curl::curl_fetch_disk(bf, tempfile())
  }, error=function(e) message(paste0(e, "\n")))
}
## BAI files do not give error
for (bf in baiurls) {
  cat(sprintf("%s\n", basename(bf)))
  tryCatch({
    curl::curl_fetch_disk(bf, tempfile())
  }, error=function(e) message(paste0(e, "\n")))
}
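for what it's worth, here is a minimal sketch of the same check that avoids the hard-coded indices and collects the outcome per file in a data frame. the data frame, file names, URL prefix and the 'check_urls()' helper are hypothetical and only for illustration; the real metadata comes from the CSV above, and the only external call assumed is 'curl::curl_fetch_disk()', as used above.

```r
## Hypothetical metadata mimicking the "bam:bai" layout of RDataPath;
## the prefix and file names are made up for illustration.
dat <- data.frame(
    Location_Prefix = "https://example.org/hub/",
    RDataPath = c("dir/s1.bam:dir/s1.bam.bai", "dir/s2.bam:dir/s2.bam.bai"),
    stringsAsFactors = FALSE
)

## Split each "bam:bai" pair and take the first/second element per row,
## so the code does not depend on the number of rows.
parts <- strsplit(dat$RDataPath, ":", fixed = TRUE)
bamurls <- paste0(dat$Location_Prefix, vapply(parts, `[`, character(1), 1))
baiurls <- paste0(dat$Location_Prefix, vapply(parts, `[`, character(1), 2))

## Try every URL and record whether the fetch succeeded, plus the error text.
## 'fetch' defaults to curl::curl_fetch_disk() but can be replaced,
## e.g. by a stub when no network is available.
check_urls <- function(urls,
                       fetch = function(u) curl::curl_fetch_disk(u, tempfile())) {
    res <- lapply(urls, function(u) {
        tryCatch({
            fetch(u)
            data.frame(file = basename(u), ok = TRUE,
                       error = NA_character_, stringsAsFactors = FALSE)
        }, error = function(e) {
            data.frame(file = basename(u), ok = FALSE,
                       error = conditionMessage(e), stringsAsFactors = FALSE)
        })
    })
    do.call(rbind, res)
}
```

calling 'check_urls(bamurls)' and 'check_urls(baiurls)' on the real URLs would then give one row per file, which makes it easier to see whether the failure is tied to the file type or to specific files.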
any further ideas?
robert.
On 29/3/23 21:10, Martin Morgan wrote:
>
> Not really helpful but this could be simplified a bit by removing the
> redirect from experiment hub, and the layer from httr to curl, so
>
> url =
> "https://functionalgenomics.upf.edu/experimenthub/gdnainrnaseqdata/LiYu22subsetBAMfiles/s32gDNA0.bam"
>
> curl::curl_fetch_disk(url, tempfile())
>
> Error in
> curl::curl_fetch_disk("https://functionalgenomics.upf.edu/experimenthub/gdnainrnaseqdata/LiYu22subsetBAMfiles/s32gDNA0.bam",
> :
>
> Failed writing received data to disk/application
>
> I notice the index file (extension .bai) works; do other BAM files
> work, too?
>
> Martin
>
> *From: *Bioc-devel <bioc-devel-bounces using r-project.org> on behalf of
> Robert Castelo <robert.castelo using upf.edu>
> *Date: *Wednesday, March 29, 2023 at 1:18 PM
> *To: *bioc-devel using r-project.org <bioc-devel using r-project.org>
> *Subject: *[Bioc-devel] httr::GET() problem downloading a
> ExperimentHub resource
>
> hi,
>
> we recently added a few new ExperimentHub resources, consisting of BAM
> files and their corresponding BAI files and hosted in my own server.
> while it seems that they are accessible, they cannot be downloaded
> through the ExperimentHub API. the minimum example reproducing the
> problem is this one (using Bioc devel):
>
> library(ExperimentHub)
> httr::GET("https://experimenthub.bioconductor.org/fetch/8129")
> Error in curl::curl_fetch_memory(url, handle = handle) :
> Failed writing received data to disk/application
>
> while there's apparently no problem to "manually" download the resource
> using 'download.file()' and loading it with
> 'GenomicAlignments::readGAlignments()':
>
> download.file("https://experimenthub.bioconductor.org/fetch/8129",
> "file.bam")
> trying URL 'https://experimenthub.bioconductor.org/fetch/8129'
> Content type 'application/octet-stream' length 13296358 bytes (12.7 MB)
> ==================================================
> downloaded 12.7 MB
>
> gal <- GenomicAlignments::readGAlignments("file.bam")
> gal[1:3]
> GAlignments object with 3 alignments and 0 metadata columns:
>       seqnames strand       cigar    qwidth     start       end     width
>          <Rle>  <Rle> <character> <integer> <integer> <integer> <integer>
>   [1]     chr1      +       49M1S        50     16208     16256        49
>   [2]     chr1      +       3S47M        50     16976     17022        47
>   [3]     chr1      -  10M177N40M        50     17046     17272       227
>           njunc
>       <integer>
>   [1]         0
>   [2]         0
>   [3]         1
>   -------
>   seqinfo: 2580 sequences from an unspecified genome
>
> any hint why 'httr::GET()' fails, while 'download.file()' doesn't?
>
> thanks!!
>
> robert.
> ps: just to clarify, the 'httr::GET()' example is behind the following
> problem:
>
> eh <- ExperimentHub()
> z <- eh[["EH8079"]]
> see ?gDNAinRNAseqData and browseVignettes('gDNAinRNAseqData') for
> documentation
> downloading 2 resources
> retrieving 2 resources
> |======================================================================|
> 100%
>
> Error: failed to load resource
> name: EH8079
> title: RNA-seq data BAM file subset of HRR589632 contaminated with 0%
> gDNA
> reason: 1 resources failed to download
> In addition: Warning messages:
> 1: download failed
> web resource path:
> ‘https://experimenthub.bioconductor.org/fetch/8129’
> local file path: ‘/home/rcastelo/.cache/R/ExperimentHub/12ba1aa03_8129’
> reason: Failed writing received data to disk/application
> 2: bfcadd() failed; resource removed
> rid: BFC3
> fpath: ‘https://experimenthub.bioconductor.org/fetch/8129’
> reason: download failed
> 3: download failed
> hub path: ‘https://experimenthub.bioconductor.org/fetch/8129’
> cache resource: ‘EH8079 : 8129’
> reason: bfcadd() failed; see warnings()
>
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioc-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
--
Robert Castelo, PhD
Associate Professor
Dept. of Medicine and Life Sciences
Universitat Pompeu Fabra (UPF)
Barcelona Biomedical Research Park (PRBB)
Dr Aiguader 88
E-08003 Barcelona, Spain
telf: +34.933.160.514