[Rd] improving the performance of install.packages
Pages, Herve
hpages using fredhutch.org
Sat Nov 9 01:10:18 CET 2019
Sounds like a very reasonable approach to me.
H.
On 11/8/19 15:17, Henrik Bengtsson wrote:
> I believe introducing a backward compatible force=TRUE is a good
> start, even if we're not ready for making force=FALSE the default at
> this point. It would help simplify quite-common instructions like:
>
> if (!requireNamespace("BiocManager"))
>     install.packages("BiocManager")
> BiocManager::install(...)
>
> to
>
> install.packages("BiocManager", force=FALSE)
> BiocManager::install(...)
>
> and more so when installing lots of packages conditionally, e.g.
>
> if (!requireNamespace("foo")) install.packages("foo")
> if (!requireNamespace("bar")) install.packages("bar")
> ...
>
> to
>
> install.packages(c("foo", "bar", ...), force = FALSE)
>
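A minimal sketch of how the force = FALSE behaviour described above could be emulated with functions that exist today. The helper name install_if_missing() and its installer argument are hypothetical and not part of base R; the installer argument is there only so the logic can be exercised without contacting a repository.

```r
# Hypothetical helper (not part of base R): emulate the proposed
# install.packages(pkgs, force = FALSE) using existing functions.
install_if_missing <- function(pkgs, ..., installer = utils::install.packages) {
  pkgs <- unique(pkgs)
  # find.package(quiet = TRUE) returns character(0) for a package that
  # cannot be found, and does not attach anything to the search path.
  is_installed <- vapply(
    pkgs,
    function(p) length(find.package(p, quiet = TRUE)) > 0L,
    logical(1)
  )
  todo <- pkgs[!is_installed]
  # Only call the installer when something is actually missing.
  if (length(todo) > 0L) installer(todo, ...)
  invisible(todo)
}
```

With such a helper, install_if_missing(c("foo", "bar")) would call install.packages() only for the packages that cannot be found. It checks presence only, not version or origin.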
> Before deciding on making force=FALSE the new default, I think it
> would be valuable to play the devil's advocate and explore and
> identify all possible downsides of such a default, e.g. breaking
> existing instructions, downstream package code that uses
> install.packages() internally, and so on.
>
> /Henrik
>
> PS. Although the idea of having update.packages() install missing
> packages is not bad, I'm not a fan of it, solely because it risks
> installation instructions starting to use update.packages() instead,
> which will certainly confuse those who don't know the history (think
> require() vs library()).
>
> On Fri, Nov 8, 2019 at 3:11 PM Pages, Herve <hpages using fredhutch.org> wrote:
>>
>> Hi Gabe,
>>
>> Keeping track of where a package was installed from would be a nice
>> feature. However it wouldn't be as reliable as comparing hashes to
>> decide whether a package needs re-installation or not.
>>
>> H.
>>
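A rough sketch of the hash comparison mentioned above, assuming a copy of the package tarball is kept locally and the repository's MD5 sum for the current tarball is known. The function name needs_reinstall() and both of its arguments are hypothetical; R does not currently record such a hash for installed packages.

```r
# Hypothetical helper: decide whether a re-installation is needed by
# comparing the MD5 sum of a locally kept tarball with the MD5 sum
# advertised by the repository (assumed to be supplied by the caller).
needs_reinstall <- function(local_tarball, repo_md5) {
  # No local copy to compare against: treat as needing installation.
  if (!file.exists(local_tarball)) return(TRUE)
  local_md5 <- unname(tools::md5sum(local_tarball))
  # Any mismatch (including an unreadable file, which yields NA)
  # is treated as "needs re-installation".
  !identical(tolower(local_md5), tolower(repo_md5))
}
```

Unlike a version-number check, this would also flag a package whose contents differ from the repository copy even though the version string is the same.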
>> On 11/8/19 12:37, Gabriel Becker wrote:
>>> Hi Josh,
>>>
>>> There are a few issues I can think of with this. The primary one is that
>>> CRAN(/Bioconductor) is not the only place one can install packages from. I
>>> might have version x.y.z of a package installed that was, at the time, a
>>> development version I got from github, or installed locally, etc. Hell, I
>>> might have a later devel version but want the CRAN version. Not common,
>>> sure, but it will likely happen often enough that install.packages not doing
>>> that for me when I tell it to is probably bad.
>>>
>>> Currently (though there has been some discussion of changing this) packages
>>> do not remember where they were installed from, so R wouldn't know if the
>>> version you have is actually fully the same one on the repository you
>>> pointed install.packages to or not. If that were changed and we knew that
>>> we were getting the byte identical package from the actual same source, I
>>> think this would be a nice addition, though without it I think it would be
>>> right a high, but not high enough, proportion of the time.
>>>
>>>> R will build the package from source (depending on what OS you're using)
>>>> twice by default. This becomes especially burdensome when people are using
>>>> big packages (i.e. lots of depends) and someone has a script with:
>>>>
>>>> install.packages("tidyverse")
>>>> ...
>>>> ... later on down the script
>>>> ...
>>>> install.packages("dplyr")
>>>>
>>>
>>> I mean, IMHO and as I think Duncan was alluding to, that's straight up an
>>> error by the script author. I think it's a few of them, actually, but it's at
>>> least one. An understandable one, sure, but that's still what it is. Scripts
>>> (which are meant to be run more than once, generally) usually shouldn't
>>> really be calling install.packages in the first place, but if they do, they
>>> should certainly not be installing umbrella packages and the packages they
>>> bring with them separately.
>>>
>>> Even having one vectorized call to install.packages where all the packages
>>> are installed would prevent this issue, including in the case where the
>>> user doesn't understand the purpose of the tidyverse package. Though the
>>> installation would still occur every time the script was run.
>>>
>>>
>>> The last thing to note is that there are at least 2 packages which provide
>>> a function which does this already (install.load and remotes), so people
>>> can get this functionality if they need it.
>>>
>>>
>>> On Fri, Nov 8, 2019 at 11:56 AM Joshua Bradley <jgbradley1 using gmail.com> wrote:
>>>
>>>>
>>>>
>>>> I assumed this list is used to discuss proposals like this to the R
>>>> codebase. If I'm on the wrong list, please let me know.
>>>>
>>>
>>> This is the right place to discuss things like this. Thanks for starting
>>> the conversation.
>>>
>>> Best,
>>> ~G
>>>
>>>>
>>>>
>>>
>>> ______________________________________________
>>> R-devel using r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>
>> --
>> Hervé Pagès
>>
>> Program in Computational Biology
>> Division of Public Health Sciences
>> Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N, M1-B514
>> P.O. Box 19024
>> Seattle, WA 98109-1024
>>
>> E-mail: hpages using fredhutch.org
>> Phone: (206) 667-5791
>> Fax: (206) 667-1319