[Rd] download.file does not process gz files correctly (truncates them?)

Hadley Wickham h.wickham at gmail.com
Tue May 8 17:15:43 CEST 2018


On Thu, May 3, 2018 at 11:34 PM, Tomas Kalibera
<tomas.kalibera at gmail.com> wrote:
> On 05/03/2018 11:14 PM, Henrik Bengtsson wrote:
>>
>> Also, as mentioned in my
>> https://stat.ethz.ch/pipermail/r-devel/2012-August/064739.html, when
>> not specifying the mode argument, the default on Windows is mode = "w"
>> *except* for certain, case-sensitive, filename extensions:
>>
>>      if(missing(mode) && length(grep("\\.(gz|bz2|xz|tgz|zip|rda|RData)$", url)))
>>          mode <- "wb"
>>
>> Just like the need for mode = "wb" on Windows, the above
>> special-file-extension-hack is only happening on Windows, and is only
>> documented in ?download.file if you're on Windows; so someone who's on
>> Linux/macOS trying to help someone on Windows may not be aware of
>> this. This adds even more confusion, e.g. "works for me".
>
> If we were designing the API today, it would probably make more sense not to
> convert any line endings by default. Today's editors _usually_ can cope with
> different line endings and it is probably easier to detect that a text file
> has incorrect line endings rather than detecting that a binary file has been
> corrupted by an attempt to convert line endings. But whether to change
> existing, documented behavior is a different question. In order to help
> users and programmers who do not read the documentation carefully we would
> create problems for users and programmers who do. The current heuristic/hack
> is in line with the compatibility approach: it detects files that are
> obviously binary, so it changes the default behavior only for cases when it
> would obviously cause damage.

From a purely utilitarian standpoint, there are far more users who do
not carefully read the documentation than users who do ;)

(I'd also argue that basing the decision on the file extension is
suboptimal, and it would be better to use the mime type if provided by
the server)
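
A rough sketch of what that could look like, purely for illustration:
guess_mode() is a hypothetical helper, not anything in base R, and
treating only text/* as safe for line-ending conversion is my
assumption. It falls back to the extension check quoted above when no
type is available. The header value itself might be obtained with
something like curlGetHeaders(), which is not shown here.

```r
# Hypothetical helper, not part of base R: pick a download.file() mode
# from a Content-Type header value, falling back to the extension check
# quoted above when the server sends no type.
guess_mode <- function(url, content_type = NULL) {
  if (!is.null(content_type) && !is.na(content_type) && nzchar(content_type)) {
    # Drop parameters such as "; charset=utf-8" and normalise case.
    type <- tolower(trimws(sub(";.*$", "", content_type)))
    # Assumption: only text/* should get line-ending conversion.
    return(if (startsWith(type, "text/")) "w" else "wb")
  }
  # Fallback: the case-sensitive extension heuristic from the quoted code.
  if (grepl("\\.(gz|bz2|xz|tgz|zip|rda|RData)$", url)) "wb" else "w"
}

guess_mode("https://example.org/data.csv.GZ")                          # "w"
guess_mode("https://example.org/data.csv.GZ", "application/gzip")      # "wb"
guess_mode("https://example.org/readme", "text/plain; charset=utf-8")  # "w"
```

The first call shows the case-sensitivity problem: an upper-case .GZ
slips past the extension check, while the declared type catches it.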

Hadley

-- 
http://hadley.nz


