[R] [External] Re: read.csv fails in R console in Ubuntu terminal but works in RStudio after R 3.6.3 upgrade to R 4.0.2?
Rasmus Liland
jr@| @end|ng |rom po@teo@no
Fri Jul 17 15:23:43 CEST 2020
On 2020-07-17 07:54 -0400, Sam H wrote:
| On 2020-07-17 09:30 +0100, ruipbarradas wrote:
| | On 2020-07-16 20:59 -0500, luke-tierney using uiowa.edu wrote:
| | | At 08:45 on 15/07/20, Sam H wrote:
| | | | Hi,
| | | |
| | | | I am trying to download some
| | | | data using read.csv and it works
| | | | perfectly in RStudio and fails
| | | | in the R console in the terminal
| | | | in Ubuntu 18.04 after upgrading
| | | | from R 3.6.3 to 4.0.2.
| | |
| | | On my Ubuntu system the download
| | | with read.csv succeeds in an R
| | | console if I set the HTTPUserAgent
| | | and download.file.method options to
| | | match the ones used by RStudio.
| | |
| | | Given how picky the server is being
| | | I would worry about whether this use
| | | is in line with the site's terms of
| | | service.
| |
| | Yes, I thought it's a site policy
| | issue too. But the file can be
| | accessed and read/downloaded from
| | RStudio and Firefox so apparently
| | there's no reason why R console
| | shouldn't.
|
| Hello,
|
| Thank you all very much for looking into this.
|
| I came across this problem when I was using TTR::stockSymbols()
| (https://github.com/joshuaulrich/TTR/blob/e6609b9f7621f3a4b1a204c159af61aebc89997e/R/WebData.R).
|
| As a workaround I added this function
| to my private R package and instead of
| read.csv I am now using
| data.table::fread() which properly
| (without failing) downloads the file
| and reads it.
Dear Sam,
Good thing you solved this.
Like Luke said, to use read.csv you need
to set the HTTPUserAgent option:
options("HTTPUserAgent"="User-Agent: RStudio Desktop (1.3.959)")
... or with cURL directly:
rasmus@twentyfive ~ % curl -H 'User-Agent: RStudio Desktop (1.3.959)' 'https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download'
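If you would rather not change HTTPUserAgent for the whole session, a minimal sketch like the one below scopes the change to a single call. with_user_agent() is a hypothetical helper of mine, not part of R or RStudio, and the agent string is simply the one from the options() line above; whether the server keeps accepting it is not something I can promise.

```r
# Hypothetical helper (not part of base R): evaluate `expr` while the
# HTTPUserAgent option is temporarily set to `agent`, then restore it.
with_user_agent <- function(agent, expr) {
  old <- options(HTTPUserAgent = agent)  # returns the previous value
  on.exit(options(old), add = TRUE)      # restore even if `expr` fails
  force(expr)                            # `expr` is lazy, so it runs here
}

# Usage (needs network access, so left commented out):
# x <- "https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download"
# dat <- with_user_agent("User-Agent: RStudio Desktop (1.3.959)",
#                        read.csv(x, as.is = TRUE, na.strings = "n/a"))
```

Because R evaluates function arguments lazily, the read.csv() call only runs inside the helper, after the option has been set.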
At 08:45 on 15/07/20, Sam H wrote:
| Before upgrading this worked in the R
| console in the terminal also without
| any issues.
In version 3.6.3, these lines fail
for me too:
> R.Version()$version.string
[1] "R version 3.6.3 (2020-02-29)"
> options()[c("download.file.method", "HTTPUserAgent")]
$<NA>
NULL
$HTTPUserAgent
[1] "R (3.6.3 x86_64-pc-linux-gnu x86_64 linux-gnu)"
> x<-"https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download"
> read.csv(x, as.is=TRUE, na="n/a")
Error in file(file, "rt") :
cannot open the connection to 'https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download'
In addition: Warning message:
In file(file, "rt") :
cannot open URL 'https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download': HTTP status was '403 Forbidden'
>
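To see which user agents the server turns away, a small sketch along these lines might help. status_for_agent() is a hypothetical helper; it relies on base R's curlGetHeaders(), which returns the response headers with an integer "status" attribute. The fetch_headers argument is only there so the logic can be tried offline, and I have not run this against the site itself.

```r
# Hypothetical helper: return the HTTP status code a URL answers with
# when requests are made under the user agent `agent`.
status_for_agent <- function(url, agent, fetch_headers = curlGetHeaders) {
  old <- options(HTTPUserAgent = agent)  # agent used for the request
  on.exit(options(old), add = TRUE)      # restore the previous option
  attr(fetch_headers(url), "status")     # e.g. 200L or 403L
}

# Usage (needs network access, so left commented out):
# x <- "https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download"
# status_for_agent(x, "R (4.0.2 x86_64-pc-linux-gnu x86_64 linux-gnu)")
# status_for_agent(x, "User-Agent: RStudio Desktop (1.3.959)")
```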
Running data.table::fread in 4.0.2:
> options()[c("download.file.method", "HTTPUserAgent")]
$<NA>
NULL
$HTTPUserAgent
[1] "R (4.0.2 x86_64-pc-linux-gnu x86_64 linux-gnu)"
> x <- "https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download"
> data.table::fread(x, header=TRUE)[1:2,]
   Symbol               Name LastSale
1:    TXG 10x Genomics, Inc.    89.19
2:     YI          111, Inc.     6.53
   MarketCap IPOyear        Sector
1:    $8.77B    2019 Capital Goods
2:  $537.81M    2018   Health Care
                                           industry
1: Biotechnology: Laboratory Analytical Instruments
2:                         Medical/Nursing Services
                       Summary Quote V9
1: https://old.nasdaq.com/symbol/txg NA
2:  https://old.nasdaq.com/symbol/yi NA
Does anyone know what data.table::fread
does differently from read.csv here, so
that setting HTTPUserAgent is not needed?
Without HTTPUserAgent set, I would have
guessed that data.table::fread just
reports something like "libcurl/7.71.1",
as read.csv would have done ...
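One way to narrow this down is to separate the download step from the parsing step. As far as I know, fread fetches a URL to a temporary file before parsing it, whereas read.csv opens a connection directly, but I have not checked exactly which downloader fread uses here, so treat that as an assumption. The sketch below follows the same "download first, then parse" pattern with a swappable downloader; read_csv_via_tempfile() and its downloader argument are hypothetical names of mine.

```r
# Hypothetical helper: fetch `url` into a temporary file using an
# injectable downloader, then parse the local copy with read.csv().
read_csv_via_tempfile <- function(url, ..., downloader = utils::download.file) {
  tmp <- tempfile(fileext = ".csv")
  on.exit(unlink(tmp), add = TRUE)            # clean up the temporary file
  downloader(url, tmp, mode = "wb", quiet = TRUE)
  utils::read.csv(tmp, ...)                   # extra arguments go to read.csv
}

# Usage (needs network access, so left commented out):
# x <- "https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download"
# read_csv_via_tempfile(x, as.is = TRUE, na.strings = "n/a")
# With the curl package installed, its downloader can be swapped in:
# read_csv_via_tempfile(x, as.is = TRUE, downloader = curl::curl_download)
```

If one downloader gets a 403 and the other does not, the difference is in the request each sends (for example its user agent) and not in how the CSV is parsed.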
Best,
Rasmus