[Rd] Faster downloads: avoid them if possible
Lluís Revilla
Tue Dec 10 00:35:01 CET 2024
Dear R-devel,
I read with interest the recent blog post on how R will have parallel
downloads, on blog.r-project.org
(https://blog.r-project.org/2024/12/02/faster-downloads/index.html).
Thanks Tomas!
The blog mentions that one of the areas where this will be noticeable is
package installation (which I tried!). However, I noticed that the same
packages might be downloaded multiple times. This happens if one
interrupts install.packages() (via Ctrl+C), if it fails because a system
dependency is missing and I fix that in a different terminal session, or
if the internet connection is cut and I try again.
One possible way to make installations/downloads faster and also
reduce the bandwidth used on repositories (and their mirrors) would be
to check whether packages need to be downloaded (again).
The PACKAGES file in <repo>/src/contrib includes an MD5sum field that
could be used to verify packages already present in the local folder
(though it might be faster to first check whether any file exists there
for the same package).
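
A minimal sketch of that check, not a proposal for the actual
implementation. needs_download() is a hypothetical helper; it assumes
source tarballs named <pkg>_<version>.tar.gz and a `db` matrix shaped
like the result of available.packages(), with package names as row
names and "Version" and "MD5sum" columns.

```r
# Hypothetical helper: TRUE if the tarball for `pkg` is missing from
# `destdir` or its MD5 checksum differs from the one listed in `db`.
needs_download <- function(pkg, db, destdir) {
  version <- db[pkg, "Version"]
  # Assumption: source package file naming <pkg>_<version>.tar.gz
  tarball <- file.path(destdir, paste0(pkg, "_", version, ".tar.gz"))

  # Cheap check first: no file at all means we must download.
  if (!file.exists(tarball)) return(TRUE)

  expected <- db[pkg, "MD5sum"]
  # Without a published checksum we cannot verify, so download again.
  if (is.na(expected)) return(TRUE)

  # tools::md5sum() returns a vector named by file path; drop the names.
  actual <- unname(tools::md5sum(tarball))
  !identical(actual, unname(expected))
}

# Usage (needs network access, so left commented out):
# db <- available.packages(type = "source")
# needs_download("cli", db, destdir = "~/pkg_downloads")
```

Checking for the file before hashing keeps the common case (nothing
downloaded yet) cheap; the checksum is only computed when a candidate
file is already there.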
In short, I propose:
1) Checking, before downloading packages, whether they already exist in
the destdir directory used by install.packages.
2) I suppose the most common scenario is to use install.packages with
the default destdir parameter (NULL). If 1) is implemented, it might be
useful to keep one common temporary download directory for the whole R
session.
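
The two points above can be approximated today from user code. This is
only a sketch under assumptions: not_yet_downloaded() and the
"pkg_cache" directory name are hypothetical, source tarballs are assumed
to be named <pkg>_<version>.tar.gz, and `db` is a matrix shaped like the
result of available.packages().

```r
# Hypothetical helper: which of `pkgs` have no tarball yet in `destdir`?
# Only file existence is checked here (no checksum verification).
not_yet_downloaded <- function(pkgs, db, destdir) {
  if (!length(pkgs)) return(character(0))
  # Assumption: source package file naming <pkg>_<version>.tar.gz
  tarballs <- file.path(destdir,
                        paste0(pkgs, "_", db[pkgs, "Version"], ".tar.gz"))
  pkgs[!file.exists(tarballs)]
}

# One download directory reused for the whole R session; it lives under
# tempdir(), so it disappears when the session ends.
session_destdir <- file.path(tempdir(), "pkg_cache")
dir.create(session_destdir, showWarnings = FALSE)

# Usage (needs network access, so left commented out):
# db <- available.packages(type = "source")
# todo <- not_yet_downloaded(c("cli", "rlang"), db, session_destdir)
# if (length(todo)) {
#   download.packages(todo, destdir = session_destdir,
#                     available = db, type = "source")
# }
```

After an interrupted or failed attempt, a second call in the same
session would only fetch the tarballs that are still missing.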
I would appreciate feedback on these ideas.
Best,
Lluís Revilla
PS: New users encountering download & installation issues often keep
seeing the progress bar (and in the future "trying URL 'https://...'")
for the same packages. There are some ways to avoid repeated
downloads, such as using the system library dependency resolver or
having local mirrors. But these are not easy/available for new useRs,
and sometimes the underlying problems are difficult to avoid (such as
not having a reliable internet connection).