[R] Least error-prone reading of Excel files?

Ebert, Timothy Aaron  tebert at ufl.edu
Thu May 16 16:14:39 CEST 2024


https://www.r-bloggers.com/2021/06/reading-data-from-excel-files-xlsxlsxcsv-into-r-quick-guide/

Excel can hold a great deal of data. However, I find that it is slow and often crashes when I use it at large scale, and it can grind my entire system to a halt. At the KB and MB scales I typically have few problems; at the GB scale Excel will hold the data, but doing anything with it is problematic (for me). I have used readxl and its read_excel() function in R and have not noticed issues at my small scales. I could read a file multiple times into different data frames and then compare them, but that too is slow and can exceed system resources.
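
For what it's worth, at small scales something like the following has worked for me (just a sketch; the file name and sheet number are placeholders, not anything from your message):

library(readxl)

dat <- read_excel("data.xlsx", sheet = 1)  # returns a tibble
str(dat)                                   # quick check of dimensions and column types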

I only deal with a few files, so I would use something like 7-Zip to decompress them before having R read them. I would bet there are existing programs that will unzip large batches of files, but I have never had to do this where the target files are scattered among other files that are not needed. If I can use "select all", that is simple enough. A rough sketch of doing the unzipping from R follows.
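
Base R can also do the unzipping itself with unzip(); this is only a sketch, and the folder and file names are placeholders:

# Unzip every .zip in a folder, then read the .xlsx files that come out.
zips <- list.files("incoming", pattern = "\\.zip$", full.names = TRUE)
lapply(zips, unzip, exdir = "unzipped")

xlsx_files <- list.files("unzipped", pattern = "\\.xlsx$",
                         full.names = TRUE, recursive = TRUE)
dat_list <- lapply(xlsx_files, readxl::read_excel)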

Tim

-----Original Message-----
From: R-help <r-help-bounces using r-project.org> On Behalf Of DynV Montrealer
Sent: Thursday, May 16, 2024 9:51 AM
To: r-help using r-project.org
Subject: [R] Least error-prone reading of Excel files?

I'm tasked with reading a table from an Excel file, and the assignment doesn't say which method to use. I went back a few lessons, and the five-year-old lesson said to pick a package with the highest score, as in the attached screenshot. Since no particular method is required for reading Excel files, I'd rather use the least error-prone one; what would that be? For example, one that tries multiple decompression algorithms if there is a decompression error. A sketch of what I mean is below.
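
Something along these lines (only a sketch; both packages are on CRAN, and the file name is made up):

# Try readxl first; if it errors, fall back to openxlsx.
read_excel_safely <- function(path, sheet = 1) {
  tryCatch(
    readxl::read_excel(path, sheet = sheet),
    error = function(e) {
      message("readxl failed (", conditionMessage(e), "); trying openxlsx")
      openxlsx::read.xlsx(path, sheet = sheet)
    }
  )
}

dat <- read_excel_safely("table.xlsx")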

Thank you kindly
