[R-sig-DB] Parameterised queries

Wed Feb 11 21:19:35 CET 2015

On Wed, Feb 11, 2015 at 1:49 PM, Hadley Wickham <h.wickham using gmail.com> wrote:

> >> I deliberately chose not to allow vector parameters because I think
> >> it's reasonable to say: "Each R function only ever generates a single
> >> query to the database", leaving the onus of looping on the user, and
> >> forcing them to think about how to ensure the vectors don't contain
> >> bad values.  This the same principle behind disallowing multiple
> >> queries (separated by ";") in a single string.
>
> > However, there are optimisation opportunities that the db driver could
> exploit if multiple values are to be inserted at once. For example, a
> database connection in auto commit mode could switch of auto commit, and
> try to insert all values in a transaction to get all-or-nothing semantics.
> Another opportunity would be a bulk load if the vectors are large.
>
> Agreed, but it's also fundamentally dangerous. I think this should be
> a separate function with that clearly describes the performance-safety
> trade off, maybe dbBindAll()?  Alternatively, you could have an
> additional `vectorise` argument that defaulted to FALSE.
>

I don't understand what is dangerous about repeatedly executing a prepared
query. Can you give a scenario?

>
> That said, can you think of a use case apart from doing bulk inserts?
> In RSQLite, I use an internal function to get optimal performance for
> dbReadTable without generally exposing a more dangerous api.
>

Other than the convenience, and it being sort of R-like, yes, bulk inserts
is the main application in my case.

>
> >>> Generally, I would be in favour of the "list of params" approach.
> Also, please note that the "bind by name" is not supported by all
> databases. Sticking to position-only parameter binding using ? would be
> most compatible (also used in JDBC for example).
> >>
> >> Yup, postgresql doesn't support names either. In that case, providing
> >> a named list would be an error. But where named binding is supported,
> >> I think it's better to use it because it eliminates a class of
> >> potential errors.
>
> > I think a script using DBI should not need to know which DBI
> implementation is running behind it. But if someone uses named parameters
> on a MySQL backend (possibly out of order), that script will not run with
> Postgres or others.
>
> That's a noble goal, but extremely difficult in principle because of
> the variations in SQL support across backends. I'd prefer not to take
> a lowest common denominator approach.
>

I thought that was the intent of DBI.

THK

>
> Hadley
>
> --
> http://had.co.nz/
>
> _______________________________________________
> R-sig-DB mailing list -- R Special Interest Group
> R-sig-DB using r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-db
>

-- 
http://www.keittlab.org/

	[[alternative HTML version deleted]]