[R-sig-DB] SQL escaping/quoting proposal

Hadley Wickham h@w|ckh@m @end|ng |rom gm@||@com
Fri Oct 18 18:44:23 CEST 2013


Hi all,

The approach that DBI takes to escaping is sub-optimal: it tries to
figure out if an R variable name matches an SQL reserved word, and if
so munge it so that there's no longer a conflict. This creates a
situation where there are some identifiers that are valid in R, and
some that are valid in SQL and we have a complicated and bug prone
approach to converting between them.

Instead, I recommend taking an approach where identifiers (i.e. table
and field names) are always quoted using the appropriate database
syntax. This not only avoids any problems with SQL reserved words, but
it also ensures that every field name in R (even those containing
spaces and other special characters) can be used in SQL.

To achieve this change, I think we should to:

* deprecate `make.db.names()`, `isSQLKeyword()`, and `SQLKeywords()`
* add new generics `sqlQuoteString()` and `sqlQuoteIdentifier()`.

The new generics would be defined on the driver object, and would come
with default methods as follows:

```
setGeneric("sqlQuoteString", function(drv, x, ...) {
  standardGeneric("sqlQuoteString")
})
setMethod("sqlQuoteString", "DBIDriver", function(drv, x, ...) {
  x <- gsub('"', '""', x, fixed = TRUE)
  paste('"', x, '"', sep = "")
})

setGeneric("sqlQuoteIdentifer", function(drv, x, ...) {
  standardGeneric("sqlQuoteIdentifer")
})
setMethod("sqlQuoteString", "DBIDriver", function(drv, x, ...) {
  x <- gsub("'", "''", x, fixed = TRUE)
  paste("'", x, "'", sep = "")
})
```

Individual implementations would be encouraged to provide methods that
use the quoting functions provided by the client library, where
available.

Does anyone see any problems with this approach?

Hadley


-- 
Chief Scientist, RStudio
http://had.co.nz/




More information about the R-sig-DB mailing list