[R] reading in only one column from text file
Seth Falcon
sfalcon at fhcrc.org
Tue Mar 7 22:56:53 CET 2006
"mark salsburg" <mark.salsburg at gmail.com> writes:
> How do I manipulate the read.table function to read in only the 2nd
> column???
If your data is small, you can read in all columns and then subset the
resulting data frame. Try that first.
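A minimal sketch of that read-everything-then-subset approach. The file is a hypothetical temporary one written just for the example, and the tab separator and header line are assumptions to adjust for your data.

```r
## Hypothetical example data: a small tab-separated file with three columns.
tmp <- tempfile(fileext = ".txt")
writeLines(c("id\tscore\tlabel",
             "1\t0.5\ta",
             "2\t1.5\tb",
             "3\t2.5\tc"), tmp)

## Read every column, then keep only the second one.
all_cols <- read.table(tmp, header = TRUE, sep = "\t")
second    <- all_cols[, 2]                 # plain vector
second_df <- all_cols[, 2, drop = FALSE]   # one-column data frame

unlink(tmp)
```

Indexing with drop = FALSE keeps the result a data frame, which is usually what you want if later code expects column names.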
Perhaps there is a nicer way to do this that I don't know about, but
I recently coded up the following to allow for a "streamy" read.table.
I've adjusted a few things but haven't tested the result, so it may
not work as is, but it should give you an idea.
+ seth
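readBatch <- function(con, batch.size) {
    ## one entry per column in the file; fix the 20 for your data
    colClasses <- rep("character", 20)
    ## adjust the index to pick out the columns that you want;
    ## drop=FALSE keeps a data frame even if you select a single column
    read.csv(con, colClasses=colClasses, as.is=TRUE,
             nrows=batch.size, header=FALSE)[, 1:2, drop=FALSE]
}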
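readTableStreamily <- function(filePath) {
    ## no idea what a good value is; it depends on the file and your RAM
    BATCH_SIZE <- 5000
    con <- file(filePath, 'r')
    on.exit(close(con))  ## close the connection even if something fails
    ## the first line holds the column names
    colNames <- unlist(readBatch(con, batch.size=1), use.names=FALSE)
    chunks <- list()
    i <- 1
    done <- FALSE
    while (!done) {
        ## read.csv signals an error once no lines are left; treat that
        ## as end of file (note that any other read error ends the loop too)
        done <- tryCatch({
            cat(".")
            chunks[[i]] <- readBatch(con, batch.size=BATCH_SIZE)
            i <- i + 1
            FALSE
        }, error=function(e) TRUE)
    }
    cat("\n")
    df <- do.call("rbind", chunks)
    names(df) <- colNames
    df
}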
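
One candidate for the "nicer way" mentioned above: the colClasses argument of read.table accepts the string "NULL" for a column, which skips that column while reading. A small sketch follows; the file, its three-column layout, and the tab separator are assumptions made up for the example.

```r
## Hypothetical example data: a small tab-separated file with three columns.
tmp <- tempfile(fileext = ".txt")
writeLines(c("id\tscore\tlabel",
             "1\t0.5\ta",
             "2\t1.5\tb"), tmp)

## "NULL" skips a column; NA lets read.table guess the type.
## The vector needs one entry per column in the file (three here).
keep_second <- c("NULL", NA, "NULL")
col2 <- read.table(tmp, header = TRUE, sep = "\t", colClasses = keep_second)

unlink(tmp)
```

The file is still scanned from start to finish, but the skipped columns are not kept in the result, so this may be enough when the full data frame would not fit comfortably in memory.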