[R] Import of Access data via RODBC changes column name ("NO" to "Expr1014") and the content of the column

Maciej Hoffman-Wecker Maciej.Hoffman-Wecker at bioskin.de
Wed Aug 15 09:28:26 CEST 2007


Thank you very much Professor Ripley! Afterwards it seems obvious where to look. 

Have a nice day,
Maciej

PS: Yes, my machine has not much memory, but it is sufficient for the smaller trial data. 

-----Ursprüngliche Nachricht-----
Von: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk] 
Gesendet: Dienstag, 14. August 2007 19:14
An: Maciej Hoffman-Wecker
Cc: r-help at stat.math.ethz.ch
Betreff: Re: AW: [R] Import of Access data via RODBC changes column name ("NO" to "Expr1014") and the content of the column

On Tue, 14 Aug 2007, Maciej Hoffman-Wecker wrote:

> Dear Professor Ripley,
>
> Thank you very much for your response. I send the problem, as I didn't 
> have any more ideas were to search for the reason. I didn't say this 
> is a R bug, knowing the responses on such mails.-)
>
> But I succeeded in developing a tiny example, that reproduces the bug 
> (wherever it is).

Thank you, that was helpful: much easier to follow that the previous code.

...

>> library(RODBC)
>> .con <- odbcConnectAccess("./test2.mdb") (.d <- try(sqlQuery(.con, 
>> "select * from Tab1")))
>  F1 NO F2
> 1  1  1  1
> 2  2  2  2
> 3  0 NA  1
> 4  1  0  0
>> (.d <- try(sqlQuery(.con, "select F1 , NO , F2 from Tab1")))
>  F1 Expr1001 F2
> 1  1        0  1
> 2  2        0  2
> 3  0        0  1
> 4  1        0  0
>> close(.con)
>
> So the problem occurs if the column names are specified within the query.
> Is the query "select F1 , NO , F2 from Tab1" invalid?

I believe so. 'NO' is an SQL92 and ODBC reserved word, at least according to http://www.bairdgroup.com/reservedwords.cfm

See also http://support.microsoft.com/default.aspx?scid=kb;en-us;286335
which says

   For existing objects with names that contain reserved words, you can
   avoid errors by surrounding the object name with brackets ([ ]).

and lists 'NO' as a reserved word.  RODBC quotes all column names it uses to be sure (and knows about most non-standard quoting mechanisms from the ODBC driver in use).  But this was a query you generated and so you need to do the quoting.

> Regarding the memory issue, I _knew_ that there must be a reason for 
> the running out of memory space. Sorry for not being more specific. My 
> question than is:
>
> Is there a way to 'reset' the environment without quitting R and 
> restarting it?

Sorry, no.  You cannot move objects in memory.

But why '477Mb' is coming up is still unexplained, and suggests that the machine has a peculiar amount of memory or some flag has been used.


>
> Thank you for your help.
>
> Kind regards,
> Maciej
>
>
> -----Ursprüngliche Nachricht-----
> Von: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
> Gesendet: Dienstag, 14. August 2007 11:51
> An: Maciej Hoffman-Wecker
> Cc: r-help at stat.math.ethz.ch
> Betreff: Re: [R] Import of Access data via RODBC changes column name ("NO" to "Expr1014") and the content of the column
>
> On Tue, 14 Aug 2007, Maciej Hoffman-Wecker wrote:
>
>>
>> Dear all,
>>
>> I have some problems with importing data from an Access data base via
>> RODBC to R. The data base contains several tables, which all are
>> imported consecutively. One table has a column with column name "NO".
>> If I run the code attached on the bottom of the mail I get no
>> complain, but the column name (name of the respective vector of the
>> data.frame) is "Expr1014" instead of "NO". Additionally the original
>> column (type
>> "text") containes "0"s and missings, but the imported column contains
>> "0"s only (type "int"). If I change the column name in the Access data
>> base to "NOx", the import works fine with the right name and the same
>> data.
>>
>> Previously I generated a tiny Access data base which reproduced the
>> problem. To be on the safe site I installed the latest version (2.5.1)
>> and now the example works fine, but within my production process the
>> error still remaines. An import into excel via ODBC works fine.
>>
>> So there is no way to figure it out whether this is a bug or a
>> feature.-)
>
> It's most likely an ODBC issue, but you have not provided a reproducible example.
>
>> The second problem I have is that when I rerun "rm(list = ls(all =
>> T)); gc()" and the import several times I get the following error:
>>
>> Error in odbcTables(channel) : Calloc could not allocate (263168 of 1)
>> memory In addition: Warning messages:
>> 1: Reached total allocation of 447Mb: see help(memory.size) in:
>> odbcQuery(channel, query, rows_at_time)
>> 2: Reached total allocation of 447Mb: see help(memory.size) in:
>> odbcQuery(channel, query, rows_at_time)
>> 3: Reached total allocation of 447Mb: see help(memory.size) in:
>> odbcTables(channel)
>> 4: Reached total allocation of 447Mb: see help(memory.size) in:
>> odbcTables(channel)
>>
>> which is surprising to me, as the first two statements should delete
>> all
>
> How do you _know _what they 'should' do?  That only deletes all objects in the workspace, not all objects in R, and not all memory blocks used by R.
>
> Please do read ?"Memory-limits" for the possible reasons.
>
> Where did '447Mb' come from?  If this machine has less than 2Gb of RAM, buy some more.
>
>
>> objects and recover the memory. Is this only a matter of memory? Is
>> there any logging that reduces the memory? Or is this issue connected to
>> the upper problem?
>>
>> I added the code on the bottom - maybe there is some kind of misuse I
>> lost sight of. Any hints are appreciated.
>>
>> Kind regards,
>> Maciej
>>
>>> version
>>               _
>> platform       i386-pc-mingw32
>> arch           i386
>> os             mingw32
>> system         i386, mingw32
>> status
>> major          2
>> minor          5.1
>> year           2007
>> month          06
>> day            27
>> svn rev        42083
>> language       R
>> version.string R version 2.5.1 (2007-06-27)
>>
>>
>> ## code
>>
>> get.table <- function(name, db, drop = NULL){
>>  .con <- try(odbcConnectAccess(db), silent = T)
>>  if(!inherits(.con, "RODBC")) return(.con)
>>  ## exclude memo columns
>>  .t <- try(sqlColumns(.con, name))
>>  if(inherits(.t, "try-error")){close(.con); return(.t)}
>>  .t <- .t[.t$"COLUMN_SIZE" < 255, "COLUMN_NAME"]
>>  .t <- paste(.t, collapse = ",")
>>  ## get table
>>  .t <- paste("select", .t, "from", name)
>>  .d <- try(sqlQuery(.con, .t), silent = T)
>>  if(inherits(.d, "try-error")){close(.con); return(.d)}
>>  .con <- try(close(.con), silent = T)
>>  if(inherits(.con, "try-error")) return(.con)
>>  .d <- .d[!names(.d) %in% drop]
>>  return(.d)
>> }
>>
>> get.alltables <- function(db){
>>  .con <- try(odbcConnectAccess(db), silent = T)
>>  if(!inherits(.con, "RODBC")) return(.con)
>>  .tbls <- try(sqlTables(.con)[["TABLE_NAME"]])
>>  if(inherits(.tbls, "try-error")){close(.con); return(.tbls)}
>>  .con <- try(close(.con), silent = T)
>>  if(inherits(.con, "try-error")) return(.con)
>>  .tbls <- .tbls[-grep("^MSys", .tbls)]
>>  .d <- lapply(seq(along = .tbls), function(.i){
>>    .d <-
>>      try(get.table(.tbls[.i], db = db))
>>    return(invisible(.d))
>>  })
>>  names(.d) <- .tbls
>>  .ok <- !sapply(.d, inherits, "try-error")
>>  return(list(notdone = .d[!.ok], data = .d[.ok]))
>> }
>>
>> library(RODBC)
>>
>> alldata <- get.alltables(db = "./myaccessdb.MDB")
>>
>> ## code end
>>
>> ______________________________________________
>> R-help at stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



More information about the R-help mailing list