[Rd] extending the colClasses argument in read.table
Romain François
romain at r-enthusiasts.com
Mon Nov 21 10:59:49 CET 2011
Hello,
We released the int64 package to CRAN a few days ago. The package
provides S4 classes "int64" and "uint64" that represent signed and
unsigned 64-bit integer vectors.
One further development of the package is to make it easier to read
64-bit integer data from csv and similar files.
I have this function that wraps a call to read.csv to:
- read the "int64" and "uint64" columns as "character"
- convert them afterwards to the appropriate type
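    # assumes colClasses, when given, has one unnamed entry per column
    read.csv.int64 <- function (file, ...){
        dots <- list( file, ... )
        if( "colClasses" %in% names(dots) ){
            colClasses <- dots[["colClasses"]]
            # %in% (rather than ==) keeps NA entries in colClasses from
            # producing NA in the index vectors
            idx.int64 <- colClasses %in% "int64"
            idx.uint64 <- colClasses %in% "uint64"
            # first pass: read the 64-bit columns as character
            colClasses[ idx.int64 | idx.uint64 ] <- "character"
            dots[["colClasses" ]] <- colClasses
            df <- do.call( "read.csv", dots )
            # second pass: convert them to the appropriate type
            if( any( idx.int64 ) ){
                df[ idx.int64 ] <- lapply( df[ idx.int64 ], as.int64 )
            }
            if( any( idx.uint64 ) ){
                df[ idx.uint64 ] <- lapply( df[ idx.uint64 ], as.uint64 )
            }
            df
        } else {
            # no colClasses given: nothing to convert
            read.csv( file, ... )
        }
    }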
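To illustrate what "drivers" could look like, here is a rough sketch that
generalises the same two-pass pattern to a table of converters. The names
col_drivers and read_csv_with_drivers are made up for this example, and the
"hexint" converter is only a stand-in (base R only) for something like
as.int64; none of this exists in read.table or in the int64 package.

```r
# Hypothetical registry: colClasses name -> function taking a character vector.
col_drivers <- list(
  # Stand-in converter for illustration; a real driver might be as.int64.
  hexint = function(x) strtoi(x, base = 16L)
)

# Hypothetical wrapper: read driver-backed columns as character, then convert.
# Assumes colClasses has one unnamed entry per column.
read_csv_with_drivers <- function(file, ..., colClasses, drivers = col_drivers) {
  requested <- colClasses
  custom <- requested %in% names(drivers)
  # first pass: let read.csv see only classes it already understands
  colClasses[custom] <- "character"
  df <- read.csv(file, ..., colClasses = colClasses)
  # second pass: apply the registered converter to each custom column
  for (i in which(custom)) {
    df[[i]] <- drivers[[requested[i]]](df[[i]])
  }
  df
}

df <- read_csv_with_drivers(
  textConnection("id,flag\nff,1\n10,2\n"),
  colClasses = c("hexint", "integer")
)
```

With a registry like this, supporting a new column type would mean adding one
entry to the list instead of writing another wrapper function.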
I was wondering if it would make sense to extend the colClasses argument
so that other packages can provide drivers for their own classes. Users
could then just call read.csv, read.table, etc. directly.
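For reference, the read.table help page says that for a colClasses value
other than the built-in ones there needs to be an as method (from package
methods) for conversion from "character" to the specified formal class. The
sketch below tries that route with a made-up class "bigid" that simply keeps
the digits as character. It is a stand-in for illustration, not the int64
class, and I have not checked how far this hook goes for the int64 package.

```r
library(methods)

# Hypothetical stand-in for a package-provided class such as "int64":
# it just stores the digits as a character vector.
setClass("bigid", contains = "character")

# Register the character -> "bigid" coercion that read.table can look up
# when it meets "bigid" in colClasses.
setAs("character", "bigid", function(from) new("bigid", from))

# 2^53 + 1 and 2^53 + 3: values a double cannot represent exactly.
csv <- "id,value\n9007199254740993,1.5\n9007199254740995,2.5\n"
df <- read.csv(text = csv, colClasses = c("bigid", "numeric"))
```

Here the user calls read.csv directly and only the coercion method lives in
the package, which is close to the driver idea above.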
Before I start digging into the internals of read.table, I wanted to
ask whether people think this would be a good idea.
Best Regards,
Romain
--
Romain Francois
Professional R Enthusiast
http://romainfrancois.blog.free.fr