[Rd] download.file does not process gz files correctly (truncates them?)

Hadley Wickham h.wickham at gmail.com
Tue May 8 17:15:43 CEST 2018

On Thu, May 3, 2018 at 11:34 PM, Tomas Kalibera
<tomas.kalibera at gmail.com> wrote:
> On 05/03/2018 11:14 PM, Henrik Bengtsson wrote:
>> Also, as mentioned in my
>> https://stat.ethz.ch/pipermail/r-devel/2012-August/064739.html, when
>> not specifying the mode argument, the default on Windows is mode = "w"
>> *except* for certain, case-sensitive, filename extensions:
>>      if(missing(mode) && length(grep("\\.(gz|bz2|xz|tgz|zip|rda|RData)$", url)))
>>          mode <- "wb"
>> Just like the need for mode = "wb" on Windows, the above
>> special-file-extension-hack is only happening on Windows, and is only
>> documented in ?download.file if you're on Windows; so someone who's on
>> Linux/macOS trying to help someone on Windows may not be aware of
>> this. This adds even more confusion, e.g. "works for me".
> If we were designing the API today, it would probably make more sense not to
> convert any line endings by default. Today's editors _usually_ can cope with
> different line endings and it is probably easier to detect that a text file
> has incorrect line endings rather than detecting that a binary file has been
> corrupted by an attempt to convert line endings. But whether to change
> existing, documented behavior is a different question. In order to help
> users and programmers who do not read the documentation carefully we would
> create problems for users and programmers who do. The current heuristic/hack
> is in line with the compatibility approach: it detects files that are
> obviously binary, so it changes the default behavior only for cases when it
> would obviously cause damage.

From a purely utilitarian standpoint, there are far more users who do
not carefully read the documentation than users who do ;)

(I'd also argue that basing the decision on the file extension is
suboptimal, and it would be better to use the MIME type if provided by
the server.)
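
To make the two approaches concrete, here is a minimal sketch in R.
mode_from_extension() simply wraps the check quoted above;
mode_from_content_type() is a hypothetical helper (not part of
download.file or any package) that picks the mode from a Content-Type
header value instead. Treating only text/* as text, and falling back to
"w" when no header is available, are assumptions made for illustration.

```r
# Wraps the extension check quoted earlier in the thread.
# Note that the pattern is case-sensitive and anchored at the end of the URL.
mode_from_extension <- function(url) {
  if (length(grep("\\.(gz|bz2|xz|tgz|zip|rda|RData)$", url))) "wb" else "w"
}

# Hypothetical helper: choose the mode from a Content-Type header value.
# Assumption: only "text/*" media types are written in text mode; anything
# else is treated as binary. If no usable value is supplied, use `fallback`.
mode_from_content_type <- function(content_type, fallback = "w") {
  if (length(content_type) != 1L || is.na(content_type) || !nzchar(content_type))
    return(fallback)
  # Drop parameters such as "; charset=utf-8" and normalise case.
  media_type <- tolower(trimws(sub(";.*$", "", content_type)))
  if (startsWith(media_type, "text/")) "w" else "wb"
}

# The extension check misses upper-case suffixes and query strings.
mode_from_extension("http://example.org/data.csv.gz")
mode_from_extension("http://example.org/data.csv.GZ")
mode_from_extension("http://example.org/data.gz?download=1")

# A server-reported type does not depend on how the file is named.
mode_from_content_type("application/gzip")
mode_from_content_type("text/plain; charset=utf-8")
```

With the extension check, "data.csv.GZ" or a URL ending in
"data.gz?download=1" still falls through to mode = "w", whereas a
server-reported type of application/gzip would select "wb" regardless
of the name. The header value itself would have to come from the server
response, e.g. by parsing the output of curlGetHeaders(). Passing
mode = "wb" explicitly to download.file() sidesteps the question
entirely.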


