[R] Import of Access data via RODBC changes column name ("NO" to "Expr1014") and the content of the column
Maciej Hoffman-Wecker
Maciej.Hoffman-Wecker at bioskin.de
Tue Aug 14 13:33:27 CEST 2007
Dear Professor Ripley,
Thank you very much for your response. I send the problem, as I didn't have any more ideas were to search for the reason. I didn't say this is a R bug, knowing the responses on such mails.-)
But I succeeded in developing a tiny example, that reproduces the bug (wherever it is).
I generated a small Access data base "test2.mdb" with one table "Tab1" and following columns:
"Field name" "Field type"
F1 Number
NO Number
F2 Number
(sorry if the Access identifiers are not the standard ones, as I have a german Access version)
The content of the "Tab1" table is:
F1 NO F2
1 1 1
2 2 2
0 1
1 0 0
(The column "NO" contains one missing)
Now if I import the data into R, I get the following results:
> library(RODBC)
> .con <- odbcConnectAccess("./test2.mdb")
> (.d <- try(sqlQuery(.con, "select * from Tab1")))
F1 NO F2
1 1 1 1
2 2 2 2
3 0 NA 1
4 1 0 0
> (.d <- try(sqlQuery(.con, "select F1 , NO , F2 from Tab1")))
F1 Expr1001 F2
1 1 0 1
2 2 0 2
3 0 0 1
4 1 0 0
> close(.con)
So the problem occurs if the column names are specified within the query.
Is the query "select F1 , NO , F2 from Tab1" invalid?
Regarding the memory issue, I _knew_ that there must be a reason for the running out of memory space. Sorry for not being more specific. My question than is:
Is there a way to 'reset' the environment without quitting R and restarting it?
Thank you for your help.
Kind regards,
Maciej
-----Ursprüngliche Nachricht-----
Von: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
Gesendet: Dienstag, 14. August 2007 11:51
An: Maciej Hoffman-Wecker
Cc: r-help at stat.math.ethz.ch
Betreff: Re: [R] Import of Access data via RODBC changes column name ("NO" to "Expr1014") and the content of the column
On Tue, 14 Aug 2007, Maciej Hoffman-Wecker wrote:
>
> Dear all,
>
> I have some problems with importing data from an Access data base via
> RODBC to R. The data base contains several tables, which all are
> imported consecutively. One table has a column with column name "NO".
> If I run the code attached on the bottom of the mail I get no
> complain, but the column name (name of the respective vector of the
> data.frame) is "Expr1014" instead of "NO". Additionally the original
> column (type
> "text") containes "0"s and missings, but the imported column contains
> "0"s only (type "int"). If I change the column name in the Access data
> base to "NOx", the import works fine with the right name and the same
> data.
>
> Previously I generated a tiny Access data base which reproduced the
> problem. To be on the safe site I installed the latest version (2.5.1)
> and now the example works fine, but within my production process the
> error still remaines. An import into excel via ODBC works fine.
>
> So there is no way to figure it out whether this is a bug or a
> feature.-)
It's most likely an ODBC issue, but you have not provided a reproducible example.
> The second problem I have is that when I rerun "rm(list = ls(all =
> T)); gc()" and the import several times I get the following error:
>
> Error in odbcTables(channel) : Calloc could not allocate (263168 of 1)
> memory In addition: Warning messages:
> 1: Reached total allocation of 447Mb: see help(memory.size) in:
> odbcQuery(channel, query, rows_at_time)
> 2: Reached total allocation of 447Mb: see help(memory.size) in:
> odbcQuery(channel, query, rows_at_time)
> 3: Reached total allocation of 447Mb: see help(memory.size) in:
> odbcTables(channel)
> 4: Reached total allocation of 447Mb: see help(memory.size) in:
> odbcTables(channel)
>
> which is surprising to me, as the first two statements should delete
> all
How do you _know _what they 'should' do? That only deletes all objects in the workspace, not all objects in R, and not all memory blocks used by R.
Please do read ?"Memory-limits" for the possible reasons.
Where did '447Mb' come from? If this machine has less than 2Gb of RAM, buy some more.
> objects and recover the memory. Is this only a matter of memory? Is
> there any logging that reduces the memory? Or is this issue connected to
> the upper problem?
>
> I added the code on the bottom - maybe there is some kind of misuse I
> lost sight of. Any hints are appreciated.
>
> Kind regards,
> Maciej
>
>> version
> _
> platform i386-pc-mingw32
> arch i386
> os mingw32
> system i386, mingw32
> status
> major 2
> minor 5.1
> year 2007
> month 06
> day 27
> svn rev 42083
> language R
> version.string R version 2.5.1 (2007-06-27)
>
>
> ## code
>
> get.table <- function(name, db, drop = NULL){
> .con <- try(odbcConnectAccess(db), silent = T)
> if(!inherits(.con, "RODBC")) return(.con)
> ## exclude memo columns
> .t <- try(sqlColumns(.con, name))
> if(inherits(.t, "try-error")){close(.con); return(.t)}
> .t <- .t[.t$"COLUMN_SIZE" < 255, "COLUMN_NAME"]
> .t <- paste(.t, collapse = ",")
> ## get table
> .t <- paste("select", .t, "from", name)
> .d <- try(sqlQuery(.con, .t), silent = T)
> if(inherits(.d, "try-error")){close(.con); return(.d)}
> .con <- try(close(.con), silent = T)
> if(inherits(.con, "try-error")) return(.con)
> .d <- .d[!names(.d) %in% drop]
> return(.d)
> }
>
> get.alltables <- function(db){
> .con <- try(odbcConnectAccess(db), silent = T)
> if(!inherits(.con, "RODBC")) return(.con)
> .tbls <- try(sqlTables(.con)[["TABLE_NAME"]])
> if(inherits(.tbls, "try-error")){close(.con); return(.tbls)}
> .con <- try(close(.con), silent = T)
> if(inherits(.con, "try-error")) return(.con)
> .tbls <- .tbls[-grep("^MSys", .tbls)]
> .d <- lapply(seq(along = .tbls), function(.i){
> .d <-
> try(get.table(.tbls[.i], db = db))
> return(invisible(.d))
> })
> names(.d) <- .tbls
> .ok <- !sapply(.d, inherits, "try-error")
> return(list(notdone = .d[!.ok], data = .d[.ok]))
> }
>
> library(RODBC)
>
> alldata <- get.alltables(db = "./myaccessdb.MDB")
>
> ## code end
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
More information about the R-help
mailing list