[Rd] Issues with libcurl + HTTP status codes (eg. 403, 404)

Kevin Ushey kevinushey at gmail.com
Tue Aug 25 21:54:43 CEST 2015


Hi all,

The following fails for me (on OS X, although I imagine it's the same
on other platforms using libcurl):

    options(download.file.method = "libcurl")
    options(repos = c(CRAN = "https://cran.rstudio.com/",
                      CRANextra = "http://www.stats.ox.ac.uk/pub/RWin"))
    install.packages("lattice") ## could be any package

gives me:

    > options(download.file.method = "libcurl")
    > options(repos = c(CRAN = "https://cran.rstudio.com/",
    +                   CRANextra = "http://www.stats.ox.ac.uk/pub/RWin"))
    > install.packages("lattice") ## could be any package
    Installing package into ‘/Users/kevinushey/Library/R/3.2/library’
    (as ‘lib’ is unspecified)
    Error: Line starting '<!DOCTYPE HTML PUBLI ...' is malformed!

This seems to come from a call to `available.packages()` for a URL that
doesn't exist on the server (likely when querying PACKAGES on the
CRANextra repo).

E.g.:

    > URL <- "http://www.stats.ox.ac.uk/pub/RWin"
    > available.packages(URL, method = "internal")
    Warning: unable to access index for repository http://www.stats.ox.ac.uk/pub/RWin
         Package Version Priority Depends Imports LinkingTo Suggests Enhances License License_is_FOSS
         License_restricts_use OS_type Archs MD5sum NeedsCompilation File Repository
    > available.packages(URL, method = "libcurl")
    Error: Line starting '<!DOCTYPE HTML PUBLI ...' is malformed!

It looks like libcurl downloads and retrieves the 403 page itself,
rather than reporting that it was actually forbidden, e.g.:

    > download.file("http://www.stats.ox.ac.uk/pub/RWin/bin/macosx/mavericks/contrib/3.2/PACKAGES.gz",
    +               tempfile(), method = "libcurl")
    trying URL 'http://www.stats.ox.ac.uk/pub/RWin/bin/macosx/mavericks/contrib/3.2/PACKAGES.gz'
    Content type 'text/html; charset=iso-8859-1' length 339 bytes
    ==================================================
    downloaded 339 bytes

Using `method = "internal"` gives an error related to the inability to
access that URL due to the HTTP status 403.
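
A minimal sketch of one possible user-level check, assuming the symptom is
always an HTML error page saved in place of the index: look at the first
line of the downloaded file before passing it to the parser. The helper
name `looks_like_html` and the sample file contents are hypothetical, made
up for illustration; this is not something R does itself.

```r
# Hypothetical helper (not part of R): TRUE if the file at `path`
# begins like an HTML document rather than a package index.
looks_like_html <- function(path) {
  # Read only the first line; a PACKAGES index starts with a "Package:" field.
  first <- readLines(path, n = 1L, warn = FALSE)
  length(first) == 1L &&
    grepl("^\\s*<(!DOCTYPE|html)", first, ignore.case = TRUE)
}

# Simulate both cases with local temporary files (no network needed).
# The contents below are invented stand-ins, not real server responses.
html_file <- tempfile()
writeLines(c("<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML 2.0//EN\">",
             "<html><head><title>403 Forbidden</title></head></html>"),
           html_file)

index_file <- tempfile()
writeLines(c("Package: lattice", "Version: 1.0"), index_file)

looks_like_html(html_file)   # TRUE
looks_like_html(index_file)  # FALSE
```

A check like this only treats the symptom; it does not recover the HTTP
status code that libcurl received.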

The overarching issue here is that package installation shouldn't fail
outright just because libcurl cannot access one of the configured
repositories.
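
The behaviour asked for here can be approximated from user code. The
following is only a sketch under assumptions: the wrapper name
`available_from_working_repos`, its `fetch` argument, and the example.com
style URLs are hypothetical. It queries each contrib URL separately and
skips any that raise an error, rather than aborting the whole lookup.

```r
# Hypothetical wrapper (not part of R): query each contrib URL on its own
# and skip the ones that fail. `fetch` defaults to available.packages();
# it is a parameter only so the sketch can be run without network access.
available_from_working_repos <- function(contriburls,
                                         fetch = utils::available.packages) {
  results <- lapply(contriburls, function(url) {
    tryCatch(
      fetch(url),
      error = function(e) {
        # Downgrade the failure to a warning and mark this repo as skipped.
        warning("skipping repository ", url, ": ", conditionMessage(e),
                call. = FALSE)
        NULL
      }
    )
  })
  # Drop the failed repositories and stack the remaining matrices.
  # Packages present in several repositories are not de-duplicated here.
  do.call(rbind, Filter(Negate(is.null), results))
}

# Stand-in for available.packages(): fails for one made-up URL and
# returns a one-row matrix for the other.
fake_fetch <- function(url) {
  if (grepl("broken", url)) {
    stop("Line starting '<!DOCTYPE HTML PUBLI ...' is malformed!")
  }
  matrix(c("lattice", url), nrow = 1,
         dimnames = list("lattice", c("Package", "Repository")))
}

pkgs <- suppressWarnings(
  available_from_working_repos(
    c("http://good.example.com/contrib", "http://broken.example.com/contrib"),
    fetch = fake_fetch
  )
)
rownames(pkgs)  # "lattice"
```

With the default `fetch`, each element of `contriburls` would be passed to
`available.packages()` as its `contriburl` argument.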

> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.4 (Yosemite)

locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] testthat_0.8.1.0.99  knitr_1.11           devtools_1.5.0.9001
[4] BiocInstaller_1.15.5

loaded via a namespace (and not attached):
 [1] httr_1.0.0     R6_2.0.0.9000  tools_3.2.2    parallel_3.2.2 whisker_0.3-2
 [6] RCurl_1.95-4.1 memoise_0.2.1  stringr_0.6.2  digest_0.6.4   evaluate_0.7.2

Thanks,
Kevin


