[R-sig-DB] Parameterised queries

Hadley Wickham h@w|ckh@m @end|ng |rom gm@||@com
Wed Feb 11 20:49:10 CET 2015


>> I deliberately chose not to allow vector parameters because I think
>> it's reasonable to say: "Each R function only ever generates a single
>> query to the database", leaving the onus of looping on the user, and
>> forcing them to think about how to ensure the vectors don't contain
>> bad values.  This the same principle behind disallowing multiple
>> queries (separated by ";") in a single string.

> However, there are optimisation opportunities that the db driver could exploit if multiple values are to be inserted at once. For example, a database connection in auto commit mode could switch of auto commit, and try to insert all values in a transaction to get all-or-nothing semantics. Another opportunity would be a bulk load if the vectors are large.

Agreed, but it's also fundamentally dangerous. I think this should be
a separate function with that clearly describes the performance-safety
trade off, maybe dbBindAll()?  Alternatively, you could have an
additional `vectorise` argument that defaulted to FALSE.

That said, can you think of a use case apart from doing bulk inserts?
In RSQLite, I use an internal function to get optimal performance for
dbReadTable without generally exposing a more dangerous api.

>>> Generally, I would be in favour of the “list of params” approach. Also, please note that the “bind by name” is not supported by all databases. Sticking to position-only parameter binding using ? would be most compatible (also used in JDBC for example).
>>
>> Yup, postgresql doesn't support names either. In that case, providing
>> a named list would be an error. But where named binding is supported,
>> I think it's better to use it because it eliminates a class of
>> potential errors.

> I think a script using DBI should not need to know which DBI implementation is running behind it. But if someone uses named parameters on a MySQL backend (possibly out of order), that script will not run with Postgres or others.

That's a noble goal, but extremely difficult in principle because of
the variations in SQL support across backends. I'd prefer not to take
a lowest common denominator approach.

Hadley

-- 
http://had.co.nz/




More information about the R-sig-DB mailing list