[R] How to set a filter during reading tables

Gabor Grothendieck ggrothendieck at gmail.com
Sun May 31 20:08:42 CEST 2009


The sqldf package can read a subset of rows and columns into R without
reading the entire file into R.  There are a few caveats:

- It does not support ftp so you will need to download the file to your
  computer first as shown in the example below
- since value is an SQL keyword it turns value into value__1 to avoid
a collision.
- you will have to convert the value column to numeric yourself as shown:

library(sqldf)
download.file("ftp://ftp.bls.gov/pub/time.series/wp/wp.data.7.Chemicals",
"Chemicals.txt", method = "wget")

# define wp as a file with indicated format
wp <- file("Chemicals.txt")
attr(wp, "file.format") <- list(sep = "\t", header = TRUE)

# use sqldf to read it in keeping only indicated rows
wp.df <- sqldf("select * from wp where footnote_codes = 'p' and period = 'M01'")

# fix up type of value__1
wp.df$value__1 <- as.numeric(as.character(wp.df$value__1))

head(wp.df)

See http://sqldf.googlecode.com



On Wed, May 27, 2009 at 12:27 PM,  <guox at ucalgary.ca> wrote:
> We are reading big tables, such as,
>
> Chemicals <-
> read.table('ftp://ftp.bls.gov/pub/time.series/wp/wp.data.7.Chemicals',header
> = TRUE, sep = '\t', as.is =T)
>
> I was wondering if it is possible to set a filter during loading so that
> we just load what we want not the whole table each time. Thanks,
>
> -james
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



More information about the R-help mailing list