[Rd] [Feature Request] Hide API Key in download.file() / R's libcurl

Martin Maechler
Mon Feb 5 13:17:41 CET 2024


>>>>> Simon Urbanek 
>>>>>     on Sun, 4 Feb 2024 10:33:34 +1300 writes:

    > Any reason why you didn't use quiet=TRUE to suppress that
    > output?  

He wrote 'quite' instead of 'quiet' {see cited below '1. quite=1'}
and probably never tried the correct spelling ...
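
A quick way to see the difference between the two spellings, as a minimal
sketch: 'quiet' is a formal argument of install.packages(), whereas 'quite'
is not and simply ends up in '...'. The variable names below are only for
illustration, and the actual install call is left commented out because it
needs network access.

```r
# List the formal arguments of install.packages() and check both spellings.
args <- names(formals(utils::install.packages))
has_quiet <- "quiet" %in% args  # the documented argument
has_quite <- "quite" %in% args  # the misspelling from the original post

# The intended call (not run here, it needs a reachable repository):
# install.packages("zoo", quiet = TRUE)
```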

    > There is no official API structure for
    > credentials in R repositories, so R has no way of knowing
    > which parts of the URL are credentials as it is not under
    > R's purview - it could be part of the path or anything, so
    > there is no way R can reliably mask it. Hence it makes
    > more sense for the user to suppress the output if they
    > think it may contain sensitive information - and R
    > supports that.

    > If that's still not enough, then please make a concrete
    > proposal that defines exactly what kind of processing you'd
    > like to see under what conditions - and how you think that
    > will solve the problem.

    > Cheers, Simon



    >> On Feb 2, 2024, at 5:28 AM, Xinyi <xinyi.xu97 using gmail.com>
    >> wrote:
    >> 
    >> Hi all,
    >> 
    >> When trying to install a package from R using
    >> install.packages(), it will print out the full url
    >> address (of the remote repository) it was trying to
    >> access. A bit of further digging shows it comes from the
    >> in_do_curlDownload function in R's libcurl module
    >> <https://github.com/wch/r-source/blob/trunk/src/modules/internet/libcurl.c>:
    >> install.packages() calls download.packages(), and
    >> download.packages() calls download.file(), which uses
    >> "libcurl" as its default method.
    >> 
    >> This line from the R source mirror
    >> <https://github.com/wch/r-source/blob/trunk/src/modules/internet/libcurl.c#L772>
    >> ("if (!quiet) REprintf(_("trying URL '%s'\n"), url);")
    >> prints the full url it is trying to access.
    >> 
    >> This is totally fine for public urls without credentials,
    >> but in the case that a given url contains an API key, it
    >> poses security issues. For example, if the
    >> getOption("repos") has been overridden to a customized
    >> repository (protected by API keys), then
    >> > install.packages("zoo")
    >> Installing packages into '--removed local directory path--'
    >> trying URL 'https://--removed userid--:--removed api-key--@repository-addresss.com:4443/.../src/contrib/zoo_1.8-12.tar.gz'
    >> Content type 'application/x-gzip' length 782344 bytes (764 KB)
    >> ===================================
    >> downloaded 764 KB
    >> 
    >> * installing *source* package 'zoo' ...
    >> -- further logs removed --
    >> >
    >> 
    >> I also tried several other options:
    >> 
    >> 1. quite=1
    >> > install.packages("zoo", quite=1)
    >> It did hide the url, but it also hid all other useful
    >> information.
    >> 
    >> 2. method="curl"
    >> > install.packages("zoo", method="curl")
    >> This does not print the url when the download is
    >> successful, but if there are any errors, it still prints
    >> the url with the API key in it.
    >> 
    >> 3. method="wget"
    >> > install.packages("zoo", method="wget")
    >> This replaces the API key with *password*, but I wasn't able
    >> to install packages with this method even with public repos;
    >> it fails with the error "Warning: unable to access index for
    >> repository https://cloud.r-project.org/src/contrib/4.3:
    >> 'wget' call had nonzero exit status"
    >> 
    >> 
    >> In other dynamic languages' package managers like
    >> Python's pip, API keys are hidden by default since pip
    >> 18.x in 2018, and masked by "****" from pip 19.x in 2019;
    >> see the examples below. Can we get a similar default
    >> behaviour in R?
    >> 
    >> 1. with pip 10.x
    >> $ pip install numpy -v # API key was not hidden
    >> Looking in indexes: https://--removed userid--:--removed api-key--@repository-addresss.com:4443/.../pypi/simple
    >> 
    >> 2. with pip 18.x # All credentials are removed by pip
    >> $ pip install numpy -v
    >> Looking in indexes: https://repository-addresss.com:4443/.../pypi/simple
    >> 
    >> 3. with pip 19.x onwards # userid is kept, API key is replaced by ****
    >> $ pip install numpy -v
    >> Looking in indexes: https://userid:****@repository-addresss.com:4443/.../pypi/simple
    >> 
    >> 
    >> I was instructed by https://www.r-project.org/bugs.html
    >> that I should get some discussion on r-devel before
    >> filing a feature request. So looking forward to
    >> comments/suggestions.
    >> 
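
For the sake of discussion, the pip-style behaviour quoted above could be
sketched in R roughly as follows. This is only an illustration, not code from
R or libcurl: mask_credentials() and its keep_user argument are hypothetical
names, and the sketch assumes the credentials appear in the
scheme://user:secret@host form used in the example URLs of this thread.

```r
# Hypothetical helper (not part of R): mask or strip the "user:secret@"
# part of a URL before it is printed.
mask_credentials <- function(url, keep_user = TRUE) {
  # group 1: scheme://   group 2: user   then ":secret@"
  pattern <- "^([A-Za-z][A-Za-z0-9+.-]*://)([^/@:]*):[^/@]*@"
  # keep_user = TRUE  -> user:****@   (like pip 19.x)
  # keep_user = FALSE -> drop both    (like pip 18.x)
  replacement <- if (keep_user) "\\1\\2:****@" else "\\1"
  sub(pattern, replacement, url)
}

# Placeholder URL in the same shape as the one in the original post.
url <- "https://userid:api-key@repository-addresss.com:4443/.../src/contrib/zoo_1.8-12.tar.gz"
masked   <- mask_credentials(url)
stripped <- mask_credentials(url, keep_user = FALSE)

# What a masked progress message might look like.
message("trying URL '", masked, "'")
```

URLs without a user:secret@ part are returned unchanged. As Simon points out,
a pattern like this cannot recognise a key that is placed in the path or in
the query string, so it would cover only this one convention.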
