[Rd] Including mode='wb' in download.file() for .xlsx files on Windows ?
Avraham Adler
@vr@h@m@@d|er @end|ng |rom gm@||@com
Sun Aug 10 20:52:49 CEST 2025
If I recall correctly, xlsx files are XML. It is the xls/xlsb files which are binary.
https://learn.microsoft.com/en-us/openspecs/office_standards/ms-xlsx/2c5dee00-eff2-4b22-92b6-0738acd4475e
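
One way to check this rather than rely on memory: an .xlsx package is stored on disk as a ZIP archive whose parts are XML, so the file should begin with the ZIP local-file-header signature (bytes 50 4B 03 04). A minimal sketch in base R follows; `is_zip_container` is a hypothetical helper name, and the temporary files only imitate the first bytes of a real workbook and of a plain XML file.

```r
# Hypothetical helper: TRUE if the file starts with the ZIP local-file-header
# signature "PK\003\004", as a ZIP-packaged file such as .xlsx should.
is_zip_container <- function(path) {
  sig <- readBin(path, what = "raw", n = 4L)
  identical(sig, as.raw(c(0x50, 0x4b, 0x03, 0x04)))
}

# Stand-in files (assumptions for illustration, not real documents):
# one starting with the ZIP signature, one containing plain XML text.
zip_like <- tempfile(fileext = ".xlsx")
writeBin(as.raw(c(0x50, 0x4b, 0x03, 0x04, 0x14, 0x00)), zip_like)

xml_like <- tempfile(fileext = ".xml")
writeLines('<?xml version="1.0"?><workbook/>', xml_like)

is_zip_container(zip_like)
is_zip_container(xml_like)
```

Running the helper on a real downloaded .xlsx would show whether the file is a ZIP container on disk, and so whether a binary transfer matters for it.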
> On Aug 10, 2025, at 2:38 PM, Paul McQuesten <mcquesten using gmail.com> wrote:
>
> Perhaps it would be simpler, and more future-proof, for R to always
> download as binary.
> Are there any modern consumers of text files that are bothered by '\r\n'?
> Or even Macintosh '\r' line terminators?
>
>> On Sun, Aug 10, 2025 at 1:22 PM Hernando Cortina <hch using alum.mit.edu> wrote:
>>
>> Yes, .docx and .pptx are part of the same specification.
>>
>> Kind regards,
>> Hernando
>>
>> From: Paul McQuesten <mcquesten using gmail.com>
>> Date: Sunday, August 10, 2025 at 1:34 PM
>> To: Hernando Cortina <hch using alum.mit.edu>
>> Subject: Re: [Rd] Including mode='wb' in download.file() for .xlsx files on Windows ?
>>
>> IIUC, '.docx' files are also binary?
>>
>>
>>
>> On Sun, Aug 10, 2025 at 11:29 AM Hernando Cortina <hcortina71 using gmail.com>
>> wrote:
>>
>> Hello all, regarding download.file():
>>
>> On Windows, if mode is not supplied (missing()) and url ends in one of
>> ‘.gz’, ‘.bz2’, ‘.xz’, ‘.tgz’, ‘.zip’, ‘.jar’, ‘.rda’,
>> ‘.rds’, ‘.RData’ or ‘.pdf’, mode = "wb" is set so that a binary
>> transfer is done to help unwary users.
>>
>> May I suggest adding .xlsx to the list of extensions that get this
>> treatment?
>>
>> Downloading such files is probably quite a common activity in the R
>> community, and having to add mode="wb" manually may well catch
>> Windows users unaware, particularly if they are coming from Linux or
>> Mac, where this is not necessary.
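
Until an extension is on that list, passing `mode` explicitly is the portable spelling. The sketch below is only an illustration under stated assumptions: it uses a local file:// URL built from a temporary file instead of a real server, and the payload is a handful of stand-in bytes, not a real workbook.

```r
# Stand-in payload containing LF bytes, which text-mode output on Windows
# would expand to CR LF; this is not a real .xlsx file.
payload <- as.raw(c(0x50, 0x4b, 0x03, 0x04, 0x0a, 0x41, 0x0a, 0x00))
src <- tempfile(fileext = ".xlsx")
writeBin(payload, src)

# Build a file:// URL for the temporary file so no network is needed.
src_path <- normalizePath(src, winslash = "/")
src_url <- if (startsWith(src_path, "/")) {
  paste0("file://", src_path)
} else {
  paste0("file:///", src_path)  # Windows drive-letter paths
}

dest <- tempfile(fileext = ".xlsx")
# Explicit binary mode: harmless on Linux and macOS, and needed on Windows
# for extensions that are not on the automatic list.
status <- download.file(src_url, dest, mode = "wb", quiet = TRUE)

# Compare the downloaded bytes with the original payload.
copied <- readBin(dest, what = "raw", n = file.size(dest))
identical(copied, payload)
```

The same call with a real https URL in place of `src_url` is what a Windows user currently has to remember to write for .xlsx downloads.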
>>
>> I understand that it’s hard to know when to stop when adding
>> additional extensions. That said, .xlsx is quite ubiquitous in the
>> wild and standardized under ECMA-376.
>>
>> I hope this might be helpful to others, and thank you for your
>> consideration.
>> Hernando
>> ---------------
>>
>> The change in src/library/utils/R/Windows/download.file.R would be:
>>
>> …
>>
>> if(missing(mode) &&
>>    length(grep("\\.(gz|bz2|xz|tgz|zip|jar|rd[as]|RData|xlsx)$",
>>                URLdecode(url))))
>>     mode <- "wb"
>>
>> …
>>
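
To see which URLs the proposed condition would catch, the pattern can be tried on its own. This is a rough sketch, not part of the patch: `defaults_to_wb` is a hypothetical helper that mirrors the `grep()` test above, and the example.org URLs are made up.

```r
# Pattern copied from the proposed change.
binary_ext <- "\\.(gz|bz2|xz|tgz|zip|jar|rd[as]|RData|xlsx)$"

# Hypothetical helper mirroring the condition: TRUE when a missing `mode`
# would be set to "wb" for this URL.
defaults_to_wb <- function(url) {
  length(grep(binary_ext, URLdecode(url))) > 0L
}

# Made-up URLs for illustration only.
urls <- c(
  "https://example.org/data/report.xlsx",
  "https://example.org/data/report.xls",
  "https://example.org/data/report.xlsx?download=1",
  "https://example.org/data/my%20report.xlsx"
)
res <- vapply(urls, defaults_to_wb, logical(1), USE.NAMES = FALSE)
res
```

Because the pattern is anchored at the end of the decoded URL, a URL with a query string after the extension does not match, so an explicit mode = "wb" is still needed in that case.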
>> ______________________________________________
>> R-devel using r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>
>
> [[alternative HTML version deleted]]