[R] count how many row i have in a txt file in a directory

Rui Barradas rui1174 at sapo.pt
Sun Feb 26 18:39:46 CET 2012


> The first step before creating a row-by-row loop is to know
> how many rows there are in the txt file, without loading it into R,
> to avoid memory problems.
> Does anyone know a specific function for this?

I don't believe there's a specific function for this.
If you want to know how many rows there are in a txt file, try this:

numTextFileLines <- function(filename, header=FALSE, sep=",", nrows=5000){
	tc <- file(filename, open="rt")
	on.exit(close(tc))
	if(header){
		# cnames: column names (not used)
		cnames <- read.table(file=tc, sep=sep, nrows=1, stringsAsFactors=FALSE)
		# cnames <- as.character(cnames)
	}
	n <- 0
	while(TRUE){
		# At end of file read.table() fails with "no lines available in input"
		x <- tryCatch(read.table(file=tc, sep=sep, nrows=nrows), error=function(e) e)
		if (any(grepl("no lines available", unclass(x))))
			break
		if(nrow(x) < nrows){
			n <- n + nrow(x)
			break
		}
		n <- n + nrows
	}
	n
}

# Make a data file
N <- 1e7 + 1
d <- data.frame(X=1:N, Y=sample(10, N, TRUE), MyValue=rnorm(N))
write.table(d, file="test.txt", row.names=FALSE, sep=",")

# Count its lines, but not the header, nrows=5k at a time
t1 <- system.time({
	nlines <- numTextFileLines("test.txt", header=TRUE)
})
cat(" Lines read:", nlines, "\n", "Last block:", nlines %% 5000, "\n")

# Clean-up
unlink("test.txt")
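
If only the line count is needed, the blocks do not have to be parsed into data frames at all. The following is a minimal sketch of that idea, not part of the function above: it reads raw text with readLines() a chunk at a time and adds up the lengths. The name countLinesChunked and its chunk argument are made up for illustration. Unlike read.table(), it also counts blank lines.

```r
# Hypothetical helper: count lines by reading raw text in chunks,
# so that at most 'chunk' lines are held in memory at any time.
countLinesChunked <- function(filename, header=FALSE, chunk=5000){
	con <- file(filename, open="rt")
	on.exit(close(con))
	n <- 0
	repeat{
		# readLines() returns character(0) once the connection is exhausted
		lines <- readLines(con, n=chunk, warn=FALSE)
		if(length(lines) == 0)
			break
		n <- n + length(lines)
	}
	# Do not count the header line, if there is one
	if(header && n > 0)
		n <- n - 1
	n
}

# Small example on a temporary file: one header line plus 12 data rows
tmp <- tempfile(fileext=".txt")
write.table(data.frame(X=1:12, Y=12:1), file=tmp, row.names=FALSE, sep=",")
countLinesChunked(tmp, header=TRUE, chunk=5)
unlink(tmp)
```

The chunk size only trades memory for the number of read calls; the result does not depend on it.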

> I have a large TXT (X,Y,MyValue) file in a directory and I wish to import
> the txt row by row in a loop, to keep only the data that are inside a
> buffer (using inside.owin of spatstat) and delete the rest.

Maybe you don't need to count the number of rows in the file at all;
you could adapt the code above to process the file in blocks.
Something like:

# Start of the function code is the same
		if (any(grepl("no lines available", unclass(x))))
			break
		# Process 'x', row-wise
		apply(x, 1, MyFunction)
		if(nrow(x) < nrows){
			... etc ...
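
To make the block-wise idea concrete, here is a self-contained sketch under some assumptions. The names filterTextFileBlocks, keep and inBox are hypothetical, and the file is assumed to have a single header line. Instead of catching the read.table() error, this version reads each block as text with readLines() and parses it with read.table(text=...), so the end of the file shows up as an empty result. With spatstat, keep could presumably wrap inside.owin(block$X, block$Y, w) for a window w; a plain rectangle test stands in for it here.

```r
# Hypothetical helper: read 'filename' in blocks of 'nrows' lines and
# keep only the rows for which keep(block) is TRUE.
filterTextFileBlocks <- function(filename, keep, sep=",", nrows=5000){
	tc <- file(filename, open="rt")
	on.exit(close(tc))
	# Read the header once so that every block gets the same column names
	hdr <- readLines(tc, n=1)
	cnames <- scan(text=hdr, what="", sep=sep, quiet=TRUE)
	kept <- list()
	repeat{
		lines <- readLines(tc, n=nrows, warn=FALSE)
		# An empty result means the whole file has been read
		if(length(lines) == 0)
			break
		x <- read.table(text=lines, sep=sep, col.names=cnames)
		# 'keep' must return one logical value per row of the block
		kept[[length(kept) + 1]] <- x[keep(x), , drop=FALSE]
	}
	# Only the retained rows are ever held together in memory
	do.call(rbind, kept)
}

# Example: a small file and a made-up rectangle rule in place of a real buffer
tmp <- tempfile(fileext=".txt")
d <- data.frame(X=1:20, Y=rep(1:4, 5), MyValue=rnorm(20))
write.table(d, file=tmp, row.names=FALSE, sep=",")
inBox <- function(b) b$X >= 5 & b$X <= 15 & b$Y >= 2
res <- filterTextFileBlocks(tmp, keep=inBox, nrows=6)
nrow(res)
unlink(tmp)
```

Working on a whole block with a vectorised test is usually much faster than calling a function once per row with apply().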

Hope this helps,
Rui Barradas
