[Rd] RODBC: follow up

Jens Oehlschlägel-Akiyoshi jens.oehlschlaegel-akiyoshi@mdfactory.de
Wed, 8 Mar 2000 10:14:38 +0100



Hi Michael,
Here is the result of my first testing:

(0) Dr. Watson
Sorry, I should have given a more precise description: by default in R
exists a function with the name 'table', if you use this name as in

  odbcPrimaryKeys(0, table)

a Dr. Watson occurs, because table does not give a table name.
However, it is always a good idea to prevent Dr. Watsons, even if it is
triggered by a stupid user, as in my case.


(1, may be Davids comment already solves this?)
There are still some problems with sqlTables() and sqlQuery() which should
return a data.frame, but only return one line.
I tried with MS Access and Oracle, and now odbcGetErrMsg() somewhat explains
the problem.

sqlTables(0)
  TABLE_QUALIFIER TABLE_OWNER TABLE_NAME   TABLE_TYPE REMARKS
1    C:\\TEMP\\db          NA   MSysACEs SYSTEM TABLE      NA
Browse[1]> odbcGetErrMsg(channel)
[1] "[RODBC]ERROR: Row count not supported"


Why is this? If it is generally possible, that some ODBC drivers do not
support Row count,
then we need to check odbcGetErrMsg() after each of those functions, which
are expected to return a data.frame.
Or better those functions have to check on their own and to build up the
COMPLETE data frame using odbcFetchRow()
until all rows are fetched. Of course this should not be R-code but C-code,
as looping in R is not a good idea.



(2)
in sqlSave()

  if (any(sqlTables(0)[3] == tablename)) {

probably should read

  if (any(sqlTables(channel)[3] == tablename)) {

furthermore if sqlTables() fails with "[RODBC]ERROR: Row count not
supported"
this check for existence is currently not reliable (sqlTables() silently
returns a one-line data.frame).



(3, see posting to r-devel)
in sqlSave()

  newname <- as.data.frame(cbind(NA, NA, tablename, as.matrix(names(dat)),
    12, "varchar", 255, 255, NA, NA, 1, NA))

creates a data.frame with "" as column names (except for "tablename" as
third column name)
I don't know whether this is a R bug or expected behaviour. However it
currently creates trouble as the later call of

  query <- sqltablecreate(tablename, newname, keys = 0)

fails at

Browse[1]>
debug: if (coldata[i, 11] == 1) {
    null <- "NULL"
} else {
    null <- "NOT NULL"
}
Browse[1]>
Error in [.data.frame(coldata, i, 11) : undefined columns selected


because newname/coldata misses colnames

  debug: print(coldata)
  Browse[1]>
          tablename
1 NA NA        dd V1 12 varchar 255 255 NA NA 1 NA
2 NA NA        dd V2 12 varchar 255 255 NA NA 1 NA


BTW: why use as.matrix(names(dat)) ?



(4) in sqltablecreate

  gsub("[^A-Za-z]+", "", as.character(coldata[i, 4]))

destroys numbers in column names, rather use

  gsub("[^A-Za-z0-9]+", "", as.character(coldata[i, 4]))


Furthermore think about substituting '_' in SQLNAMES to '.' in
RNAMEodbcColumns(channel, table)S and vice versa.
  gsub("[.]", "_", RNAME)
  gsub("[_]", ".", SQLNAME)


(5) in several functions you use

  if (!any(sqlTables(channel) == sqtable)) )

shouldn't that read

  if (!any(sqlTables(channel)[4] == sqtable))

?

The above mentioned one-line-only problem with sqlTables() of course breaks
this code


(6) Some more comments (perhaps due to my ignorance about ODBC, can ODBC
write data without sending sql statements?)

(6.1) looping
sqlwrite() sqlgetresults() uses for-loops for writing/reading data rowwise.
I expect this to be rather slow.
Is it possible to have this done in C instead of R?
If in R, might it be that the sequence of multiple rbind() leads to memory
segmentation (growing lists)??
May be one should ask  on R-Help, if it is better to first create an object
of approbriate size and then replacing single rows.

(6.2) block read/write
Reading all rows at a time with sqlgetresults() could run into memory
trouble.
Fetching single rows using odbcFetchRow() probably creates slow looping
problems in R.
I think the interface would become more suitable for big datasets if it had
functions for writing and reading BLOCKS of rows. E.g. sqlgetresults() could
get a parameter nmax= .

(6.3) sqlCopy()
Convert query results into a data.frame before sending them to another
channel could lead to memory trouble.
Better would be a lowlevel implementation in C, which copies rowwise.

Regards


Jens

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._