[R-sig-DB] Parameterised queries

Hadley Wickham h@w|ckh@m @end|ng |rom gm@||@com
Wed Feb 11 21:24:27 CET 2015


>> >> I deliberately chose not to allow vector parameters because I think
>> >> it's reasonable to say: "Each R function only ever generates a single
>> >> query to the database", leaving the onus of looping on the user, and
>> >> forcing them to think about how to ensure the vectors don't contain
>> >> bad values.  This the same principle behind disallowing multiple
>> >> queries (separated by ";") in a single string.
>>
>> > However, there are optimisation opportunities that the db driver could
>> > exploit if multiple values are to be inserted at once. For example, a
>> > database connection in auto commit mode could switch of auto commit, and try
>> > to insert all values in a transaction to get all-or-nothing semantics.
>> > Another opportunity would be a bulk load if the vectors are large.
>>
>> Agreed, but it's also fundamentally dangerous. I think this should be
>> a separate function with that clearly describes the performance-safety
>> trade off, maybe dbBindAll()?  Alternatively, you could have an
>> additional `vectorise` argument that defaulted to FALSE.
>
> I don't understand what is dangerous about repeatedly executing a prepared
> query. Can you give a scenario?

It gives a new attack vector - to introduce additional data into the
database, you just need to figure out how to turn a length 1 vector in
to a length 2 vector.

It's dangerous in the same way that allowing dbGetQuery() to execute
multiple queries is dangerous.

>> That said, can you think of a use case apart from doing bulk inserts?
>> In RSQLite, I use an internal function to get optimal performance for
>> dbReadTable without generally exposing a more dangerous api.
>
> Other than the convenience, and it being sort of R-like, yes, bulk inserts
> is the main application in my case.

Great.

Hadley

-- 
http://had.co.nz/




More information about the R-sig-DB mailing list