[R] sqldf file specification, non-ASCII

Peter Jepsen PJ at DCE.AU.DK
Thu Apr 3 19:29:03 CEST 2008


Thank you for your help, Duncan and Gabor. Yes, I found a premature line
feed in line 1562740, and I have corrected that error. The thing is, it
takes me many, many hours to save the file, so I would like to confirm
that there are no more errors further down the file. The ffe tool sounds
perfect for this job, but it doesn't seem to be available for Windows.
Is anybody out there aware of a similar Windows tool?

Thank you again for your help.
Peter.
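
[Editor's note: a quick, platform-independent stand-in for the ffe check
is a short script that compares each line's field count against the
header's. The sketch below is in Python, not R, and assumes the
comma-delimited file from the thread (hugedata.csv) with a header row
and no quoted fields that themselves contain commas.]

```python
# Sketch of an ffe-style "bad line" check that runs anywhere Python does,
# including Windows.  Assumes a comma-delimited file with a header row;
# it splits naively on the delimiter, so it will miscount fields that
# contain quoted commas.
def find_bad_lines(path, sep=","):
    bad = []
    with open(path) as f:
        # Take the expected field count from the header line.
        expected = len(f.readline().split(sep))
        # Data lines start at line 2 of the file.
        for lineno, line in enumerate(f, start=2):
            n = len(line.split(sep))
            if n != expected:
                bad.append((lineno, n))
    return expected, bad
```

Running find_bad_lines("hugedata.csv") would list every (line number,
field count) pair that disagrees with the header, such as a line with 19
fields where 52 were expected.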

-----Original Message-----
From: Gabor Grothendieck [mailto:ggrothendieck at gmail.com] 
Sent: 3. april 2008 17:08
To: Peter Jepsen
Subject: Re: [R] sqldf file specification, non-ASCII

One other thing you could try would be to run it through
ffe (fast file extractor), a free utility that you can
find via Google.  Use ffe's loose argument.  It can find
bad lines, and since it's not dependent on R it would give
you an independent check.  Regards.

On Thu, Apr 3, 2008 at 10:36 AM, Gabor Grothendieck
<ggrothendieck at gmail.com> wrote:
> Hi, Can you try it with the first 100 lines, say, of the data and
> also try reading it with read.csv to double-check your arguments
> (note that sqldf args are similar but not entirely identical to read.csv)
> and if it still gives this error, send me that 100-line file and I will
> look at it tonight or tomorrow.  Regards.
>
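
[Editor's note: a minimal way to carve out such a 100-line sample file
is sketched below, in Python rather than R; the file names follow the
thread and are otherwise placeholders.]

```python
from itertools import islice

# Sketch: copy the header plus the first n data lines of a large CSV
# into a small file, suitable for a quick read.csv / sqldf test.
def head_csv(src, dst, n=100):
    with open(src) as fin, open(dst, "w") as fout:
        # islice stops after header + n lines without reading the rest
        # of the (possibly multi-GB) file.
        fout.writelines(islice(fin, n + 1))
```

For example, head_csv("hugedata.csv", "sample.csv") writes a 101-line
file (header plus 100 data lines) that can be tested quickly.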
>
> On Thu, Apr 3, 2008 at 10:22 AM, Peter Jepsen <PJ at dce.au.dk> wrote:
> > Dear R-Listers,
> >
> > I am a Windows user (R 2.6.2) using the development version of sqldf to
> > try to read a 3GB file originally stored in .sas7bdat-format. I convert
> > it to comma-delimited ASCII format with StatTransfer before trying to
> > import just the rows I need into R. The problem is that I get this
> > error:
> >
> > > f <- file("hugedata.csv")
> > > DF <- sqldf("select * from f where C_OPR like 'KKA2%'",
> > file.format=list(header=T, row.names=F))
> > Error in try({ :
> >  RS-DBI driver: (RS_sqlite_import: hugedata.csv line 1562740 expected
> > 52 columns of data but found 19)
> > Error in sqliteExecStatement(con, statement, bind.data) :
> >  RS-DBI driver: (error in statement: no such table: f)
> >
> > Now, I know that my SAS-using colleagues are able to use this file with
> > SAS, so I was wondering whether StatTransfer'ing it to the SAS XPORT
> > format, which can be read with the 'read.xport' function in the 'foreign'
> > package, would be a better approach. The problem is, I don't know
> > how/whether I can do that at all with sqldf. I tried various ways like
> > f <- file(read.xport("hugedata.xport"))
> > but I consistently got an error message from the sqldf command. I don't
> > recall the exact error message, unfortunately, but can anybody tell me
> > whether it is at all possible to read in files in non-ASCII format
> > without having to put them in R memory?
> >
> > Thank you for your assistance.
> > Peter.
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
