[R-sig-DB] Parameterised queries

Hannes Mühleisen H@nne@@Mueh|e|@en @end|ng |rom cw|@n|
Wed Feb 11 20:09:23 CET 2015


Hi Hadley,

> On 11 Feb 2015, at 16:08, Hadley Wickham <h.wickham using gmail.com> wrote:
> 
> On Wed, Feb 11, 2015 at 8:39 AM, Hannes Mühleisen
> <Hannes.Muehleisen using cwi.nl> wrote:
>> Hi Hadley and list,
>> 
>>> On 11 Feb 2015, at 15:01, Hadley Wickham <h.wickham using gmail.com> wrote:
>>> As part of my work modernising R's database connnectors, I've been
>>> working on improving the support for parameterised queries. I'd love
>>> to get your feedback on the API I'm proposing.
>>> 
>>> The goal is to encourage people to use parameterise queries instead of
>>> pasting together SQL strings because it is much safer (no worries
>>> about SQL injection attacks) and somewhat more performant (becuase the
>>> database only needs to parse the query once).
>> Makes a lot of sense, yes. MonetDB.R has had support for this from day one. Our syntax uses the list of parameters approach, e.g.
>> 
>> dbSendUpdate(conn, "INSERT INTO sometable (a) VALUES (?)", “foobar”)
>> 
>> of course, the parameter can be a vector, in which case the query is executed multiple times.
> 
> I deliberately chose not to allow vector parameters because I think
> it's reasonable to say: "Each R function only ever generates a single
> query to the database", leaving the onus of looping on the user, and
> forcing them to think about how to ensure the vectors don't contain
> bad values.  This the same principle behind disallowing multiple
> queries (separated by ";") in a single string.
However, there are optimisation opportunities that the db driver could exploit if multiple values are to be inserted at once. For example, a database connection in auto commit mode could switch of auto commit, and try to insert all values in a transaction to get all-or-nothing semantics. Another opportunity would be a bulk load if the vectors are large.

>> Generally, I would be in favour of the “list of params” approach. Also, please note that the “bind by name” is not supported by all databases. Sticking to position-only parameter binding using ? would be most compatible (also used in JDBC for example).
> 
> Yup, postgresql doesn't support names either. In that case, providing
> a named list would be an error. But where named binding is supported,
> I think it's better to use it because it eliminates a class of
> potential errors.
I think a script using DBI should not need to know which DBI implementation is running behind it. But if someone uses named parameters on a MySQL backend (possibly out of order), that script will not run with Postgres or others. 

Best,

Hannes




-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 4154 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-db/attachments/20150211/9c4f187d/attachment.p7s>


More information about the R-sig-DB mailing list