[R] Using sqldf() to read in .fwf files

Gabor Grothendieck ggrothendieck at gmail.com
Mon Sep 15 18:42:27 CEST 2014


On Mon, Sep 15, 2014 at 12:09 PM, Doran, Harold <HDoran at air.org> wrote:
> I am learning to use sqldf() to read in very large fixed-width files that otherwise do not work efficiently with read.fwf. I found the following example online and have worked with it in various ways to read in the data:
>
> cat("1 8.3
> 210.3
> 319.0
> 416.0
> 515.6
> 719.8
> ", file = "fixed")
>
> fixed <- file("fixed")
> sqldf("select substr(V1, 1, 1) f1, substr(V1, 2, 4) f2 from fixed")
>
> I then applied this to my real-world data, but it yields the following error message, which I am not sure how to interpret:
>
> dor <- file("dor")
> sqldf("select substr(V1, 1, 1) f1, substr(V1, 2, 4) f2 from dor")
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
>   line 1 did not have 6 elements
>
> Looking at my .fwf data in a text editor shows the data are structured as I would expect. In fact, I can read in the first few lines of the file using read.fwf, and the data are as I would expect after being read into R.
>

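The error suggests that each line is being split into several fields.
We want sqldf to regard the entire line as one field, so specify sep= as
some character that does not appear in the file. For the example above:

    attr(fixed, "file.format") <- list(sep = ";")

and likewise for dor before running the query on it.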
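
Putting the pieces together, here is a minimal sketch of the whole
sequence using the sample data from the question. The temporary file and
the names fn and res are only for illustration, and ";" is assumed not to
occur anywhere in the data.

```r
library(sqldf)

# Write the sample fixed-width data from the question to a temporary file.
fn <- tempfile()
cat("1 8.3
210.3
319.0
416.0
515.6
719.8
", file = fn)

# An unopened file connection; the SQL refers to it by its R variable name.
fixed <- file(fn)

# Assumption: ";" never occurs in the data, so each whole line is read
# as a single field, V1.
attr(fixed, "file.format") <- list(sep = ";")

# Carve the fixed-width fields out of V1 by character position.
res <- sqldf("select substr(V1, 1, 1) f1, substr(V1, 2, 4) f2 from fixed")
print(res)
```

For the real file, substitute dor for fixed and pick a separator that is
known to be absent from that file.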


-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


