[Rd] extending the colClasses argument in read.table

Romain François romain at r-enthusiasts.com
Mon Nov 21 10:59:49 CET 2011


Hello,

We released the int64 package to CRAN a few days ago. The package 
provides S4 classes "int64" and "uint64" that represent signed and 
unsigned 64 bit integer vectors.

One further development of the package is to facilitate reading 64 bit 
integer data from csv and similar files.

I have this function that wraps a call to read.csv to:
- read the "int64" and "uint64" columns as "character"
- convert them afterwards to the appropriate type


read.csv.int64 <- function (file, ...){
     dots <- list( file, ... )
     if( "colClasses" %in% names(dots) ){
         colClasses <- dots[["colClasses"]]
         # %in% rather than == so that NA entries give FALSE, not NA
         idx.int64 <- colClasses %in% "int64"
         idx.uint64 <- colClasses %in% "uint64"

         # read the 64 bit columns as text first
         colClasses[ idx.int64 | idx.uint64 ] <- "character"
         dots[["colClasses" ]] <- colClasses

         df <- do.call( "read.csv", dots )
         # then convert them to the appropriate type
         if( any( idx.int64 ) ){
             df[ idx.int64 ] <- lapply( df[ idx.int64 ], as.int64 )
         }
         if( any( idx.uint64 ) ){
             df[ idx.uint64 ] <- lapply( df[ idx.uint64 ], as.uint64 )
         }
         df
     } else {
         # no colClasses given, nothing to convert
         read.csv( file, ... )
     }
}

I was wondering if it would make sense to extend the colClasses argument 
so that other packages can provide drivers. Users could then just call 
read.csv, read.table, etc. directly.
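
To make the idea concrete, here is a rough sketch of what such a driver 
mechanism could look like from the outside. It is only an illustration: 
register_col_driver, read_table_with_drivers and the "hex" class are 
names I made up for this example, nothing like them exists in base R, 
and the real thing would presumably live inside read.table rather than 
in a wrapper. The example uses a hexadecimal converter based on strtoi 
so that it runs without the int64 package.

```r
# Hypothetical registry: maps a colClasses name to a function that
# converts a character vector to the target type.
.drivers <- new.env()

register_col_driver <- function(class, converter) {
    assign(class, converter, envir = .drivers)
    invisible(NULL)
}

# Hypothetical wrapper standing in for a driver-aware read.table.
read_table_with_drivers <- function(file, ..., colClasses = NA,
                                    reader = read.csv) {
    requested <- colClasses
    # which entries have a registered driver (%in% is FALSE for NA)
    has.driver <- requested %in% ls(.drivers, all.names = TRUE)

    # read those columns as text, leave the others to the reader
    colClasses[has.driver] <- "character"
    df <- reader(file, ..., colClasses = colClasses)

    # apply each registered converter to its column
    for (i in which(has.driver)) {
        convert <- get(requested[i], envir = .drivers)
        df[[i]] <- convert(df[[i]])
    }
    df
}

# A package would register its driver once, e.g. when it is loaded
register_col_driver("hex", function(x) strtoi(x, base = 16L))

# and users would then only deal with colClasses
con <- textConnection(c("id,value", "1,ff", "2,10"))
df <- read_table_with_drivers(con, colClasses = c("integer", "hex"))
```

With something like this, int64 would only have to register as.int64 
and as.uint64 under "int64" and "uint64".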

Before I start digging into the internals of read.table, I wanted to 
hear opinions on whether this would be a good idea.

Best Regards,

Romain

-- 
Romain Francois
Professional R Enthusiast
http://romainfrancois.blog.free.fr


