[R] Fast way to determine number of lines in a file

kMan kchamberln at gmail.com
Wed Feb 10 04:59:52 CET 2010


It depends on the type of file and your system. 'count.fields()' is
impractical for large files because it has to split every line into fields
and returns a vector with one element per line of the file. It would be
easier to use scan() with the 'sep' argument set to the end-of-line marker,
"\n", and the 'what' argument set to a null list, so nothing is actually
stored. scan() will still report the number of records read.
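
A minimal sketch of the scan() idea, for illustration only. It uses the
simpler character form of 'what', so the lines are held in memory while they
are counted, unlike the null-list form described above. 'count_lines_scan'
is a made-up name, and because 'blank.lines.skip' defaults to TRUE, blank
lines may not be counted.

```r
# Hypothetical helper: count lines by reading each line as one field.
count_lines_scan <- function(path) {
  # sep = "\n" makes each line a single field; quiet = TRUE suppresses
  # the "Read N items" message that scan() would otherwise print.
  length(scan(path, what = character(), sep = "\n", quiet = TRUE))
}

# Small self-contained check on a temporary three-line file.
tmp <- tempfile()
writeLines(c("a b c", "d e f", "g h i"), tmp)
count_lines_scan(tmp)
```

For very large files this still allocates one string per line, so it is a
convenience rather than the fastest option.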

For flat files on Windows, the command-line utilities installed with Rtools
(only the Cygwin tools and DLLs need to be installed) are the fastest option
that I know of.

if (.Platform$OS.type == "windows") {
  system.time({
    # wc -l prints "<count> <file>"; keep only the leading count
    out <- system(paste("/RTools/bin/wc -l", "much_data.bin"), intern = TRUE)
    n_lines <- as.integer(strsplit(trimws(out), " +")[[1]][1])
  })
}
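
Where wc is not available, a pure-R fallback can count lines in chunks so
that the whole file is never held in memory at once. This is only a sketch,
not something provided by Rtools or base R: 'count_lines_chunked' and the
default chunk size are made up for illustration, and it is likely to be
slower than wc.

```r
# Hypothetical helper: count lines by reading a fixed number at a time.
count_lines_chunked <- function(path, chunk = 10000L) {
  con <- file(path, open = "r")
  on.exit(close(con))          # always release the connection
  n <- 0L
  repeat {
    # readLines() returns fewer than 'chunk' lines (possibly zero) at EOF
    k <- length(readLines(con, n = chunk, warn = FALSE))
    if (k == 0L) break
    n <- n + k
  }
  n
}

# Small self-contained check on a temporary 25-line file.
tmp_chunked <- tempfile()
writeLines(as.character(1:25), tmp_chunked)
count_lines_chunked(tmp_chunked, chunk = 10L)
```

A larger chunk means fewer calls to readLines() at the cost of more memory
per call.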

Sincerely,
KeithC.

-----Original Message-----
From: Hadley Wickham [mailto:hadley at rice.edu] 
Sent: Monday, February 08, 2010 7:16 AM
To: R-help
Subject: [R] Fast way to determine number of lines in a file

Hi all,

Is there a fast way to determine the number of lines in a file?  I'm looking
for something like count.lines analogous to count.fields.

Hadley

--
http://had.co.nz/


