[Rd] Duplicated mirrors on available packages
Maxim Nazarov
Mon Sep 12 10:57:43 CEST 2022
If you profile the second run, you will see that the majority of the time is spent in the `tools:::.remove_stale_dups` function, which loops over all duplicated packages. With the same repository listed twice, that means every package.
One improvement I could think of is to replace the first line of that function
pkgs <- ap[, "Package"]
with
pkgs <- ap[!duplicated(ap[, c("Package", "Version")]), "Package"]
which would help in your example, but the function might still run slowly if many packages are present in several different versions, so there may be even better optimizations.
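To make the effect of that one-line change concrete, here is a small sketch on a made-up matrix. The package names, versions and repository labels are invented for illustration, and I am assuming the loop runs once per entry of `pkgs[duplicated(pkgs)]`; it does not reproduce the rest of `.remove_stale_dups`.

```r
# Toy stand-in for the matrix returned by available.packages().
# All names and versions below are hypothetical.
ap <- matrix(
  c("pkgA", "1.0.0", "repo1",
    "pkgB", "2.1.0", "repo1",
    "pkgA", "1.0.0", "repo2",
    "pkgB", "2.1.0", "repo2",
    "pkgB", "2.2.0", "repo3"),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("Package", "Version", "Repository"))
)

# Current first line: every package name, exact repeats included.
pkgs_all <- ap[, "Package"]
dups_all <- pkgs_all[duplicated(pkgs_all)]

# Suggested first line: drop rows that repeat a (Package, Version) pair.
pkgs_dedup <- ap[!duplicated(ap[, c("Package", "Version")]), "Package"]
dups_dedup <- pkgs_dedup[duplicated(pkgs_dedup)]

length(dups_all)    # 3 names to loop over
length(dups_dedup)  # 1 name to loop over (pkgB, which has two versions)
```

Only pkgB, which really does come in two versions, is left to loop over; the exact copies coming from the repeated repository are gone.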
Stating the obvious here, but since we don't know your 'real' use case: adding a `unique` call to the `repos` argument of `available.packages` would achieve a similar improvement without any changes to `tools`.
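A minimal sketch of that workaround, reusing the mirror URL from your example. The variable name `repos` is my own choice, and the timing call is guarded because it needs network access.

```r
mirror <- "https://cloud.r-project.org/"

# The same URL listed twice, as in the slow call.
repos <- c(mirror, mirror)

# Drop repeated URLs before the lookup. unique() compares the strings
# exactly, so "https://example.org" and "https://example.org/" would
# still count as two different repositories.
repos <- unique(repos)

length(repos)  # 1

# Needs network access, so only run it in an interactive session.
if (interactive()) {
  print(system.time(ap <- available.packages(repos = repos)))
}
```

Note that `unique()` also drops any names on the vector, which may matter if the repositories are named, as they are in `getOption("repos")`.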
Kind regards,
Maxim Nazarov
----- Original Message -----
From: "Colin Gillespie" <csgillespie using gmail.com>
To: "r-devel" <r-devel using r-project.org>
Sent: Friday, September 9, 2022 7:33:09 PM
Subject: [Rd] Duplicated mirrors on available packages
Hi
When there are duplicated repos, available.packages() takes
significantly longer to run.
For example
mirror = "https://cloud.r-project.org/"
system.time(available.packages(repos = mirror))
# user system elapsed
# 1.054 0.031 1.245
system.time(available.packages(repos = c(mirror, mirror)))
# user system elapsed
# 22.389 0.037 22.429
Best wishes,
Colin
> sessionInfo()
R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.1 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so
locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
[5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=en_GB.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_4.2.0 tools_4.2.0
Dr Colin Gillespie
https://twitter.com/csgillespie