[Rd] Issues with libcurl + HTTP status codes (eg. 403, 404)
Kevin Ushey
kevinushey at gmail.com
Tue Aug 25 23:41:28 CEST 2015
In fact, this does reproduce on R-devel:
> options(download.file.method = "libcurl")
> options(repos = c(CRAN = "https://cran.rstudio.com/", CRANextra =
+ "http://www.stats.ox.ac.uk/pub/RWin"))
> install.packages("lattice") ## could be any package
Installing package into ‘/Users/kevinushey/Library/R/3.3/library’
(as ‘lib’ is unspecified)
Error: Line starting '<!DOCTYPE HTML PUBLI ...' is malformed!
> sessionInfo()
R Under development (unstable) (2015-08-14 r69078)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.4 (Yosemite)
I think this could be problematic for users with custom CRAN
repositories. For example, if I have a CRAN repository that only
serves source packages (no binary packages), this implies that any R
session configured to download binary packages would fail to download
any packages at all (as it would barf on attempting to read the
non-existent PACKAGES file for the 'binary' branch of the custom
repository).
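
Until such failures are reported per repository, a caller could approximate the desired behaviour by querying each repository index on its own and skipping the ones that fail. This is only a minimal sketch: `collect_indexes()` and its `fetch` argument are hypothetical names introduced here for illustration, not part of R.

```r
# Hypothetical helper (not part of base R): fetch the index for each URL
# separately, turn a failure into a warning, and combine whatever succeeded.
collect_indexes <- function(urls, fetch) {
  results <- lapply(urls, function(url) {
    tryCatch(
      fetch(url),
      error = function(e) {
        warning("skipping ", url, ": ", conditionMessage(e), call. = FALSE)
        NULL
      }
    )
  })
  results <- Filter(Negate(is.null), results)
  if (length(results) == 0L) return(NULL)
  do.call(rbind, results)
}

# Intended use (needs network access, so it is left commented out):
# urls <- contrib.url(getOption("repos"), type = "source")
# collect_indexes(urls, function(u) available.packages(u, method = "libcurl"))
```

Passing the fetcher in as an argument keeps the sketch independent of the download method.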
This can also be seen by attempting to install a package using current
R-devel (since no binaries are made available for R 3.3):
> options(download.file.method = "libcurl")
> options(repos = c(CRAN = "https://cran.rstudio.com/"))
> print(getOption("pkgType"))
[1] "both"
> install.packages("lattice")
Installing package into ‘/Users/kevinushey/Library/R/3.3/library’
(as ‘lib’ is unspecified)
Error in install.packages : Line starting '<!DOCTYPE HTML PUBLI ...' is malformed!
The same error (with a different, XML, response body) is returned when using,
e.g., `https://cran.fhcrc.org`.
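
Since both the HTML and the XML error pages start with `<`, a client-side guard could peek at a downloaded index before it is parsed. The following is a rough sketch under that assumption: `looks_like_markup()` is a hypothetical helper, the check is only a heuristic, and it is not what `available.packages()` does. The file contents are made up for the example.

```r
# Hypothetical helper (not part of base R): TRUE when the first non-blank
# line of a file starts with '<', as an HTML or XML error page would.
looks_like_markup <- function(path, n = 10L) {
  lines <- trimws(readLines(path, n = n, warn = FALSE))
  lines <- lines[nzchar(lines)]
  length(lines) > 0L && grepl("^<", lines[[1L]])
}

# A file shaped like an error page (illustrative contents).
error_page <- tempfile()
writeLines(c("<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML 2.0//EN\">",
             "<html><head><title>403 Forbidden</title></head></html>"),
           error_page)

# A file shaped like a PACKAGES index (illustrative contents).
index <- tempfile()
writeLines(c("Package: lattice", "Version: 1.0"), index)

looks_like_markup(error_page)
looks_like_markup(index)
```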
Kevin
On Tue, Aug 25, 2015 at 1:33 PM, Martin Morgan <mtmorgan at fredhutch.org> wrote:
> On 08/25/2015 01:30 PM, Kevin Ushey wrote:
>>
>> Hi Martin,
>>
>> Indeed it does (and I should have confirmed myself with R-patched and
>> R-devel before posting...)
>
>
> Actually, I don't know that it does. It addresses the symptom, but I think
> there should be an error from libcurl on the 403 / 404 rather than from
> read.dcf on the error page...
>
> Martin
>
>
>>
>> Thanks, and sorry for the noise.
>> Kevin
>>
>>
>> On Tue, Aug 25, 2015, 13:11 Martin Morgan <mtmorgan at fredhutch.org> wrote:
>>
>> On 08/25/2015 12:54 PM, Kevin Ushey wrote:
>> > Hi all,
>> >
>> > The following fails for me (on OS X, although I imagine it's the same
>> > on other platforms using libcurl):
>> >
>> > options(download.file.method = "libcurl")
>> > options(repos = c(CRAN = "https://cran.rstudio.com/",
>> >   CRANextra = "http://www.stats.ox.ac.uk/pub/RWin"))
>> > install.packages("lattice") ## could be any package
>> >
>> > gives me:
>> >
>> > > options(download.file.method = "libcurl")
>> > > options(repos = c(CRAN = "https://cran.rstudio.com/",
>> > +   CRANextra = "http://www.stats.ox.ac.uk/pub/RWin"))
>> > > install.packages("lattice") ## could be any package
>> > Installing package into ‘/Users/kevinushey/Library/R/3.2/library’
>> > (as ‘lib’ is unspecified)
>> > Error: Line starting '<!DOCTYPE HTML PUBLI ...' is malformed!
>> >
>> > This seems to come from a call to `available.packages()` for a URL that
>> > doesn't exist on the server (likely when querying PACKAGES on the
>> > CRANextra repo).
>> >
>> > E.g.
>> >
>> > > URL <- "http://www.stats.ox.ac.uk/pub/RWin"
>> > > available.packages(URL, method = "internal")
>> > Warning: unable to access index for repository
>> > http://www.stats.ox.ac.uk/pub/RWin
>> >      Package Version Priority Depends Imports LinkingTo Suggests
>> >      Enhances License License_is_FOSS License_restricts_use OS_type
>> >      Archs MD5sum NeedsCompilation File Repository
>> > > available.packages(URL, method = "libcurl")
>> > Error: Line starting '<!DOCTYPE HTML PUBLI ...' is malformed!
>> >
>> > It looks like libcurl downloads and retrieves the 403 page itself,
>> > rather than reporting that it was actually forbidden, e.g.:
>> >
>> > > download.file("http://www.stats.ox.ac.uk/pub/RWin/bin/macosx/mavericks/contrib/3.2/PACKAGES.gz",
>> > +   tempfile(), method = "libcurl")
>> > trying URL 'http://www.stats.ox.ac.uk/pub/RWin/bin/macosx/mavericks/contrib/3.2/PACKAGES.gz'
>> > Content type 'text/html; charset=iso-8859-1' length 339 bytes
>> > ==================================================
>> > downloaded 339 bytes
>> >
>> > Using `method = "internal"` gives an error related to the inability to
>> > access that URL due to the HTTP status 403.
>> >
>> > The overarching issue here is that package installation shouldn't fail
>> > even if libcurl fails to access one of the repositories set.
>> >
>>
>> With
>>
>> > R.version.string
>> [1] "R version 3.2.2 Patched (2015-08-25 r69179)"
>>
>> the behavior is to warn with an indication of the repository for which the
>> problem occurs
>>
>> > URL <- "http://www.stats.ox.ac.uk/pub/RWin"
>> > available.packages(URL, method="libcurl")
>> Warning: unable to access index for repository
>> http://www.stats.ox.ac.uk/pub/RWin:
>> Line starting '<!DOCTYPE HTML PUBLI ...' is malformed!
>>      Package Version Priority Depends Imports LinkingTo Suggests Enhances
>>      License License_is_FOSS License_restricts_use OS_type Archs MD5sum
>>      NeedsCompilation File Repository
>> > available.packages(URL, method="internal")
>> Warning: unable to access index for repository
>> http://www.stats.ox.ac.uk/pub/RWin:
>> cannot open URL 'http://www.stats.ox.ac.uk/pub/RWin/PACKAGES'
>>      Package Version Priority Depends Imports LinkingTo Suggests Enhances
>>      License License_is_FOSS License_restricts_use OS_type Archs MD5sum
>>      NeedsCompilation File Repository
>>
>> Does that work for you / address the problem?
>>
>> Martin
>>
>> > > sessionInfo()
>> > R version 3.2.2 (2015-08-14)
>> > Platform: x86_64-apple-darwin13.4.0 (64-bit)
>> > Running under: OS X 10.10.4 (Yosemite)
>> >
>> > locale:
>> > [1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
>> >
>> > attached base packages:
>> > [1] stats graphics grDevices utils datasets methods base
>> >
>> > other attached packages:
>> > [1] testthat_0.8.1.0.99 knitr_1.11 devtools_1.5.0.9001
>> > [4] BiocInstaller_1.15.5
>> >
>> > loaded via a namespace (and not attached):
>> > [1] httr_1.0.0 R6_2.0.0.9000 tools_3.2.2 parallel_3.2.2 whisker_0.3-2
>> > [6] RCurl_1.95-4.1 memoise_0.2.1 stringr_0.6.2 digest_0.6.4 evaluate_0.7.2
>> >
>> > Thanks,
>> > Kevin
>> >
>> >
>>
>>
>> --
>> Computational Biology / Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N.
>> PO Box 19024 Seattle, WA 98109
>>
>> Location: Arnold Building M1 B861
>> Phone: (206) 667-2793
>>