[Rd] download.file does not process gz files correctly (truncates them?)

Joris Meys jorismeys at gmail.com
Fri May 4 10:00:07 CEST 2018


On Fri, May 4, 2018 at 8:34 AM, Tomas Kalibera <tomas.kalibera at gmail.com>
wrote:

> The current heuristic/hack is in line with the compatibility approach: it
> detects files that are obviously binary, so it changes the default behavior
> only for cases when it would obviously cause damage.
>
> Tomas


Well, I was trying to download a .gz file and download.file() didn't detect
it as binary. The reason is that the link doesn't end in .gz but in %2Egz,
with the dot percent-encoded as its ASCII code (hex 2E) instead of written
literally. That is common practice in a lot of links.

Hence I propose to change the line in download.file() that does this check
to:

  if (missing(mode) && length(grep("\\.(gz|bz2|xz|tgz|zip|rda|RData)$",
                                   URLdecode(url))))

Using URLdecode() ensures that .gz, .RData etc. will be detected correctly
in a percent-encoded URL.
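
To illustrate the difference, here is a minimal sketch. The two URLs are
made up for the example, and looks_binary() is a hypothetical helper, not
something in base R; only the pattern is taken from the line quoted above.

```r
# Hypothetical URLs for illustration only; neither is a real download link.
plain_url   <- "https://example.org/data/archive.tar.gz"
encoded_url <- "https://example.org/data/archive%2Etar%2Egz"

# The extension pattern from the check in download.file() quoted above.
binary_ext <- "\\.(gz|bz2|xz|tgz|zip|rda|RData)$"

# Hypothetical helper: applies the pattern, optionally after decoding the URL.
looks_binary <- function(url, decode = FALSE) {
  if (decode) url <- utils::URLdecode(url)
  grepl(binary_ext, url)
}

looks_binary(plain_url)                   # TRUE
looks_binary(encoded_url)                 # FALSE: "%2Egz" has no literal dot
looks_binary(encoded_url, decode = TRUE)  # TRUE once "%2E" is decoded to "."
```

Independently of the heuristic, passing mode = "wb" to download.file()
explicitly should sidestep the problem for such links.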

Cheers
Joris

-- 
Joris Meys
Statistical consultant

Department of Data Analysis and Mathematical Modelling
Ghent University
Coupure Links 653, B-9000 Gent (Belgium)

