[Rd] write.csv performance improvements?

Toby Hocking tdhock5 @end|ng |rom gm@||@com
Thu Mar 30 07:24:13 CEST 2023


Dear R-devel,
I did a systematic comparison of write.csv with similar functions, and
observed two asymptotic inefficiencies that could be improved.

1. write.csv takes quadratic time, O(N^2), in the number of columns N.
Can write.csv be improved to use a linear-time algorithm, so that it can
handle CSV files with larger numbers of columns?
For more details including figures and session info, please see
https://github.com/tdhock/atime/issues/9
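
A rough sketch of how such a timing check might look in base R. This is not
the atime benchmark from the linked issue; the column counts, the one-row
data frame, and the names n_cols, time_write and timings are assumptions
chosen only for illustration.

```r
# Time write.csv on one-row data frames with an increasing number of columns.
# If the cost were linear in N, doubling N should roughly double the time.
n_cols <- 2^(7:11)  # assumed sizes: 128, 256, ..., 2048 columns

time_write <- function(N) {
  df <- as.data.frame(matrix(0, nrow = 1, ncol = N))
  out <- tempfile(fileext = ".csv")
  on.exit(unlink(out))  # remove the temporary file when the function exits
  system.time(write.csv(df, out))[["elapsed"]]
}

timings <- data.frame(n_cols, seconds = sapply(n_cols, time_write))
print(timings)
```

Timings at these sizes are small and noisy, so a single run only hints at the
trend; repeated runs and larger N give a clearer picture.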

2. write.csv uses memory that is linear in the number of rows, whereas
similar R functions for writing CSV use only constant memory. This issue is
less important to fix, because linear memory is already used to store the
data in R. But since the other functions use constant memory, could
write.csv do the same? Is there some copying happening that could be
avoided? (This memory measurement uses bench::mark, which in turn uses
utils::Rprofmem.)
https://github.com/tdhock/atime/issues/10
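
A minimal sketch of one way to look at the allocations, assuming the bench
package is installed and R was built with memory profiling. It is not the
code from the linked issue; the row counts, the two-column data frame, and
the names n_rows, measure and mem_table are illustrative assumptions.

```r
# Record the bytes that bench::mark reports as allocated by write.csv
# while the number of rows grows.
n_rows <- 10^(2:4)  # assumed sizes: 100, 1000, 10000 rows

measure <- function(N) {
  df <- data.frame(x = seq_len(N), y = rnorm(N))
  out <- tempfile(fileext = ".csv")
  on.exit(unlink(out))  # remove the temporary file when the function exits
  res <- bench::mark(write.csv(df, out), iterations = 1, check = FALSE)
  as.numeric(res$mem_alloc)  # bytes; NA if memory profiling is unavailable
}

# Only run the measurement when the bench package is available.
if (requireNamespace("bench", quietly = TRUE)) {
  mem_table <- data.frame(n_rows, bytes = sapply(n_rows, measure))
  print(mem_table)
}
```

If the reported bytes grow roughly in proportion to n_rows, that matches the
linear memory use described above.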

Sincerely,
Toby Dylan Hocking
