[Bioc-devel] read.table fails with https protocol

Christian Mertes mertes at in.tum.de
Tue Sep 17 16:16:00 CEST 2019


Hey Bioc-devel community,

My package OUTRIDER is again failing intermittently on the build system,
seemingly at random. At first I thought it was due to the ImageMagick
problem I posted about some days ago, but that only produces a warning.

I think I have found the cause, but I don't really understand it. Any
help is appreciated.

From the docs I assume that *read.table* works with both http and https.
But on the build system, and sometimes also locally, it fails with the
error:

Error in read.table(URL, sep = "\t") : no lines available in input

I dug into it a bit, and it looks like *readLines* has problems
reading from https connections. See my examples below:

library(data.table)
library(utils)
library(curl)

# Link to a count table in TSV format at nature.com
URL <-
"media.nature.com/original/nature-assets/ncomms/2017/170612/ncomms15824/extref/ncomms15824-s1.txt"

# Fails with https
read.table(paste0("https://", URL), sep="\t", nrows=10)[,1:10]

# Works with plain http
read.table(paste0("http://", URL),  sep="\t", nrows=10)[,1:10]

# Works if using curl to read lines first
read.table(text=readLines(curl(paste0("https://", URL))), sep="\t",
nrows=10)[,1:10]

# Fails if using only readLines
read.table(text=readLines(paste0("https://", URL)), sep="\t",
nrows=10)[,1:10]

# Works with fread from the data.table package (it uses curl to download
# the file first)
data.frame(fread(paste0("https://", URL), sep="\t", nrows=10)[,1:10],
row.names=1)
data.frame(fread(paste0("http://", URL),  sep="\t", nrows=10)[,1:10],
row.names=1)

I guess my workaround is to use http, or to switch to fread or curl. But
I think the clean way would be to use read.table, right?
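
For illustration, here is a minimal sketch of the "download first, then
parse" workaround, kept to base R and utils. The helper name
read_table_via_download and its tries argument are hypothetical (they are
not part of OUTRIDER or of R itself). Retrying a few times is only an
assumption about how to cope with the intermittent failures; it does not
explain why reading directly from https fails.

```r
# Hypothetical helper: fetch the file to a temporary location first and
# let read.table parse the local copy instead of the remote connection.
read_table_via_download <- function(url, ..., tries = 3L) {
    tmp <- tempfile(fileext = ".txt")
    on.exit(unlink(tmp), add = TRUE)
    for (i in seq_len(tries)) {
        # download.file() returns 0 on success; a download error is
        # caught and turned into a non-zero status so we can retry.
        status <- tryCatch(
            utils::download.file(url, destfile = tmp, quiet = TRUE,
                                 mode = "wb"),
            error = function(e) 1L)
        ok <- identical(as.integer(status), 0L) &&
            file.exists(tmp) && file.size(tmp) > 0
        if (ok) {
            # Extra arguments (sep, nrows, ...) are passed to read.table
            return(utils::read.table(tmp, ...))
        }
    }
    stop("could not download ", url, " after ", tries, " attempts")
}

# Usage with the URL from the examples above (needs network access):
# read_table_via_download(paste0("https://", URL), sep = "\t",
#                         nrows = 10)[, 1:10]
```

Parsing a local copy means an empty or failed download shows up as a
download problem rather than as "no lines available in input".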

Best,

Christian

-- 

Christian Mertes
PhD Student / Lab Administrator
Gagneur lab
 
Computational Genomics
I12 - Bioinformatics Department
Technical University Munich
Boltzmannstr. 3
85748 Garching, Germany

Mail: mertes at in.tum.de
Phone: +49-89-289-19416
http://gagneurlab.in.tum.de

