[Rd] read.csv quadratic time in number of columns

Toby Hocking tdhock5 @end|ng |rom gm@||@com
Thu Mar 30 06:53:59 CEST 2023


Dear R-devel,
A number of people have observed anecdotally that read.csv is slow for
files with a large number of columns, for example:
https://stackoverflow.com/questions/7327851/read-csv-is-extremely-slow-in-reading-csv-files-with-large-numbers-of-columns
I did a systematic comparison of read.csv with similar functions, and
observed that read.csv takes quadratic time, O(N^2), in the number of
columns N, whereas the others take linear time, O(N).
Can read.csv be improved to use a linear time algorithm, so it can handle
CSV files with larger numbers of columns?
For more details including figures and session info, please see
https://github.com/tdhock/atime/issues/8
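
A minimal base R sketch of how such a scaling check could be run. It is not
the benchmark from the issue linked above; the helper name time_read_csv, the
column counts, and the number of repetitions are illustrative assumptions.
It writes a one-row CSV with N columns, times read.csv on it, and estimates
the slope of log(time) against log(N). A slope near 1 would suggest linear
time, and a slope near 2 would suggest quadratic time.

```r
# Hypothetical helper: median elapsed seconds for read.csv on a CSV file
# with one header line, one data line, and n_col columns.
time_read_csv <- function(n_col, n_rep = 3) {
  csv_file <- tempfile(fileext = ".csv")
  on.exit(unlink(csv_file))
  header <- paste0("V", seq_len(n_col), collapse = ",")
  values <- paste(seq_len(n_col), collapse = ",")
  writeLines(c(header, values), csv_file)
  median(replicate(n_rep, system.time(read.csv(csv_file))[["elapsed"]]))
}

# Column counts chosen only for illustration; larger values take longer.
n_cols <- c(500, 1000, 2000, 4000)
seconds <- sapply(n_cols, time_read_csv)
print(data.frame(n_cols, seconds))

# Estimate the empirical exponent from a log-log fit, skipping any
# timings that are too small for the timer to resolve.
ok <- seconds > 0
if (sum(ok) >= 2) {
  fit <- lm(log10(seconds[ok]) ~ log10(n_cols[ok]))
  print(coef(fit))
}
```

Timings depend on the R version and the machine, so the fitted slope is only
a rough indication; repeating the same loop with another reader would give
the comparison described above.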
Sincerely,
Toby Dylan Hocking
