[R-sig-DB] Proposal of dbQuoteIdentifier for DBI specification

Seth Falcon @eth @end|ng |rom u@erpr|m@ry@net
Sun Nov 14 06:24:02 CET 2010


Hi there,

On Fri, Nov 12, 2010 at 9:31 PM, Tomoaki NISHIYAMA
<tomoakin using kenroku.kanazawa-u.ac.jp> wrote:
> We changed that the SQL identifiers be quoted instead of mangling in
> RPostgreSQL.  This allows proper distinction of capital and small letters
> and make it possible to use the name as is what is defined in R world.
>
> This was applied uniformly to tablename argument for
> dbWriteTable(), dbExistsTable(), dbReadTable(), and dbRemoveTable()
> functions.
>
> However, this makes difficulty in schema usage, as now '.' (period)
> is allowed as a part of tablename and cannot be regarded as a
> separator. To allow construction of schema.table structure like
> "s.che.ma"."ta.ble", passing a vector c('s.che.ma', 'ta.ble')
> instead of string will be perhaps helpful.
> Each element of the vector will be quoted and thereafter connected with '.'.
>
> I am aware that RSQLite had similar discussion for mangling / quoting
> about one year ago
> https://stat.ethz.ch/pipermail/r-sig-db/2009q4/000753.html
> but haven't found what the decision and what the future direction
> is.

A change was made to the make.db.names.default function in the DBI
package to quote (using ") if the allow.keywords argument is set to
FALSE.  This was announced here:

    https://stat.ethz.ch/pipermail/r-sig-db/2009q4/000769.html
    https://stat.ethz.ch/pipermail/r-sig-db/2009q4/000770.html

> In addition while identifier quoting for Postgresql is simple,
> that looks different in MySQL that it is quote with back quote'`'
> rather than double quote and double quote works only under ANSI_QUOTES
> option.  So, if quoting rule is different among backends,
> perhaps it is worth to define dbQuoteIdentifier(con, name)
> as a common way to quote an identifier.
> Perl DBI has quote_identifier and have two forms of synopsis
>  $sql = $dbh->quote_identifier( $name );
>  $sql = $dbh->quote_identifier( $catalog, $schema, $table, \%attr );
> http://search.cpan.org/~timb/DBI/DBI.pm#quote_identifier
> The example given there is interesting that
> $id = $dbh->quote_identifier( 'link', 'schema', 'table' );
> should produce "schema"."table"@"link" for oracle.
>
> So, my proposal is that dbQuoteIdentifier() is defined similary that
> with a single string argument that should be quoted.
> For a vector argument two multiple strings, the two strings is interpreted
> as schema and tablename, and each element are quoted and concatenated
> according
> to the backend specification.
> For a vector argument with three elements, the elements are interpreted as
> dbname, schema, and tablename. Each element are quoted and concatenated
> using
> the rule appropriate for the back-end database management system.
>
> I think quote everything is the right way to go,
> and quoting should be made easier.

I think I agree that quoting everything is a reasonable approach.
Having a means to determine the correct quote character based on the
backend and its current settings seems valuable.  I would support a
patch along those lines.  Care should be taken so that extra layers of
function calls are not added to every db operation to determine and
apply quoting (IMO).

So I'm generally in favor of something along the lines of
dbQuoteIdentifier.  However, I don't like the semantics you have
proposed for the behavior of dbQuoteIdentifier in terms of the return
value for vectors based on length.  Such a function should do what it
says, quote identifiers.  Having a separate function to deal with
dbname, schema, and tablename concatenation might be useful, but that
feels like something to handle separately.

+ seth



-- 
Seth Falcon | @sfalcon | http://userprimary.net/




More information about the R-sig-DB mailing list