[R] sqldf file specification, non-ASCII

Duncan Murdoch murdoch at stats.uwo.ca
Thu Apr 3 16:58:12 CEST 2008


On 4/3/2008 10:22 AM, Peter Jepsen wrote:
> Dear R-Listers,
> 
> I am a Windows user (R 2.6.2) using the development version of sqldf to
> try to read a 3GB file originally stored in .sas7bdat-format. I convert
> it to comma-delimited ASCII format with StatTransfer before trying to
> import just the rows I need into R. The problem is that I get this
> error:
> 
>> f <- file("hugedata.csv")
>> DF <- sqldf("select * from f where C_OPR like 'KKA2%'",
> file.format=list(header=T, row.names=F))
> Error in try({ : 
>   RS-DBI driver: (RS_sqlite_import: hugedata.csv line 1562740 expected
> 52 columns of data but found 19)
> Error in sqliteExecStatement(con, statement, bind.data) : 
>   RS-DBI driver: (error in statement: no such table: f)

That error message looks pretty clear:  there's a problem on line 
1562740.  Can you look at that line and spot what the problem is?  If 
you don't have a text editor that can handle big files, you should be 
able to do it with something like this:

f <- file("hugedata.csv", "r")

# skip the first 1562739 lines in chunks of 10000, then read line 1562740
skip <- 1562739
while (skip > 10000) {
   junk <- readLines(f, 10000)
   skip <- skip - 10000
}
junk <- readLines(f, skip)
readLines(f, 1)
close(f)
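
It can also be worth looking at the lines just before and after, since an 
unmatched quote or an embedded newline a line or two earlier often shows up 
as a short line like this one.  A small variation on the snippet above 
should do it (the skip value just positions the connection a few lines 
earlier; adjust to taste):

f <- file("hugedata.csv", "r")

skip <- 1562737
while (skip > 10000) {
   junk <- readLines(f, 10000)
   skip <- skip - 10000
}
junk <- readLines(f, skip)
readLines(f, 5)   # lines 1562738 to 1562742
close(f)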



> 
> Now, I know that my SAS-using colleagues are able to use this file with
> SAS, so I was wondering whether StatTransfer'ing it to the SAS XPORT
> format which can be read with the 'read.xport' function in the 'foreign'
> package would be a better approach. 

R can usually read CSV files without trouble.  There's likely something 
wrong in your file on that line; you just need to figure out what it is 
and fix it.  (It's possible the sqldf function has a bug, but I'd suspect 
the file first.)
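
If you want to check whether that is the only bad line, count.fields() (in 
the utils package, which is loaded by default) counts the delimited fields 
on every line of a file.  Something like this should list every line that 
doesn't have the expected 52 fields; it assumes the delimiter really is a 
plain comma, and a full pass over a 3GB file will take a while:

nf <- count.fields("hugedata.csv", sep = ",")
table(nf)                      # most lines should show 52 fields
which(nf != 52 | is.na(nf))    # line numbers of anything that doesn't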

Duncan Murdoch


