[Rd] Duplicated mirrors on available packages

Maxim Nazarov m@x|m@n@z@rov @end|ng |rom open@n@|yt|c@@eu
Mon Sep 12 10:57:43 CEST 2022


If you profile the second run, you will see that most of the time is spent in the `tools:::.remove_stale_dups` function, which loops over all duplicated packages. With the same repository listed twice, that means every package.
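For reference, one way to do such profiling is with `Rprof()` and `summaryRprof()` from `utils`. This is only a sketch: `profile_top` is a hypothetical helper name, not something provided by R, and the real call needs network access, so it is guarded by `interactive()`.

```r
# Hypothetical helper: profile an expression and return the functions
# with the largest total time according to summaryRprof().
profile_top <- function(expr, n = 10) {
  prof_file <- tempfile(fileext = ".out")
  # Make sure profiling is switched off and the file removed on exit.
  on.exit({ Rprof(NULL); unlink(prof_file) })
  Rprof(prof_file)
  force(expr)   # the promise is evaluated here, while profiling is active
  Rprof(NULL)
  head(summaryRprof(prof_file)$by.total, n)
}

if (interactive()) {
  # Needs network access; same duplicated-mirror call as in the report.
  mirror <- "https://cloud.r-project.org/"
  print(profile_top(available.packages(repos = c(mirror, mirror))))
}
```

Very short expressions may record no samples at all, in which case `summaryRprof()` has nothing to summarise.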
One improvement I could think of is to replace the first line of that function
    pkgs <- ap[, "Package"]
with
    pkgs <- ap[!duplicated(ap[, c("Package", "Version")]), "Package"]
which would help in your example, but the function might still run longer if there are many packages with different versions present, so there may be even better optimizations.
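To illustrate what that subsetting does, here is a small sketch on a made-up matrix. The package names, versions and repository labels are invented for illustration. It only shows which package names would still count as duplicated, and does not reproduce the rest of `tools:::.remove_stale_dups`.

```r
# Toy stand-in for the character matrix of available packages (invented data).
ap <- matrix(
  c("foo", "1.0", "repoA",
    "bar", "2.1", "repoA",
    "foo", "1.0", "repoB",   # exact duplicate of row 1 (same Package and Version)
    "bar", "2.2", "repoB"),  # same package, different version
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("Package", "Version", "Repository"))
)

# Current first line: every repeated package name counts as a duplicate.
pkgs_all <- ap[, "Package"]
dups_all <- pkgs_all[duplicated(pkgs_all)]

# Proposed first line: rows with identical Package/Version pairs are dropped first.
pkgs_dedup <- ap[!duplicated(ap[, c("Package", "Version")]), "Package"]
dups_dedup <- pkgs_dedup[duplicated(pkgs_dedup)]

print(dups_all)    # "foo" "bar"
print(dups_dedup)  # "bar"
```

Only `bar`, which really has two different versions, is left to be looked at in the second case.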

Stating the obvious here, but since we don't know your 'real' use case: adding a `unique` call to the `repos` argument of `available.packages` would achieve a similar improvement without any modifications needed in `tools`.
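A minimal sketch of that workaround, reusing the mirror URL from the original example. The variable names are only for illustration, and the actual download is guarded by `interactive()` because it needs network access.

```r
mirror <- "https://cloud.r-project.org/"
repos <- c(mirror, mirror)

# Drop repeated entries before they reach available.packages().
# Note that unique() drops names; repos[!duplicated(repos)] would keep them.
repos_unique <- unique(repos)

if (interactive()) {
  # Needs network access.
  print(system.time(ap <- available.packages(repos = repos_unique)))
}
```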

Kind regards,
Maxim Nazarov

----- Original Message -----
From: "Colin Gillespie" <csgillespie using gmail.com>
To: "r-devel" <r-devel using r-project.org>
Sent: Friday, September 9, 2022 7:33:09 PM
Subject: [Rd] Duplicated mirrors on available packages

Hi

When there are duplicated repos, available.packages() takes
significantly longer to run.

For example

mirror = "https://cloud.r-project.org/"
system.time(available.packages(repos = mirror))
#   user  system elapsed
# 1.054   0.031   1.245
system.time(available.packages(repos = c(mirror, mirror)))
#   user  system elapsed
# 22.389   0.037  22.429

Best wishes,

Colin


> sessionInfo()
R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.1 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_4.2.0 tools_4.2.0


Dr Colin Gillespie
https://twitter.com/csgillespie

______________________________________________
R-devel using r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


