[R] RODBC, missing values, and Excel

Gabor Grothendieck ggrothendieck at gmail.com
Thu Jul 13 04:24:12 CEST 2006


Would it be good enough to just read all the sheets in?

The perl program can do that and although the read.xls R function does not
interface to that aspect of its functionality its not that difficult to access
it yourself.  Assume your excel file is in \test.xls .  Just
switch to that folder.  paste together a command to run the perl
program, run it, get a list of the file names it produced and read them in:

library(gdata)
setwd("/")
cmd <- paste("perl", system.file("perl/xls2csv.pl", package = "gdata"), "test")
system(cmd)
ff <- list.files(patt = "test_Sheet.*.csv")
sapply(ff, read.csv, na.strings = "na ", simplify = FALSE)


On 7/12/06, Sundar Dorai-Raj <sundar.dorai-raj at pdf.com> wrote:
> Hi, Gabor,
>
> Thanks for the reply. Perhaps Prof. Ripley will enlighten us as he is
> the RODBC maintainer.
>
> Unfortunately, gdata::read.xls will not work for me (at least I don't
> think it will) because I need to refer to each worksheet by name and not
> by number. For example, I need extract data from "Sheet1" and not simply
> the first sheet.
>
> Thanks,
>
> --sundar
>
> Gabor Grothendieck wrote:
> > I also got a strange result too (I renamed it sdr.read.xls
> > to distinguish it from read.xls in gdata and noticed that a
> > space got into my na's somehow so I used "na " for my
> > na.strings:
> >
> >
> >>sdr.read.xls("/test.xls", "Sheet2", na.strings = "na ")
> >
> >      x
> > 1 <NA>
> > 2 <NA>
> > 3   na
> > 4   na
> > 5   na
> > 6   na
> > 7 <NA>
> >
> > I had more success using read.xls in the gdata package.
> > Note that we need to install perl first if not already present:
> >
> >
> >>library(gdata)  # for read.xls
> >>read.xls("/test.xls", 2, na.strings = "na ")
> >
> >      x
> > 1 0.11
> > 2 0.11
> > 3   NA
> > 4   NA
> > 5   NA
> > 6   NA
> > 7 0.11
> >
> >
> >>R.version.string # XP
> >
> > [1] "Version 2.3.1 Patched (2006-06-04 r38279)"
> >
> >>packageDescription("gdata")$Version
> >
> > [1] "2.1.2"
> >
> >>packageDescription("RODBC")$Version
> >
> > [1] "1.1-7"
> >
> >
> > On 7/12/06, Sundar Dorai-Raj <sundar.dorai-raj at pdf.com> wrote:
> >
> >>Hi, all,
> >>
> >>I'm trying to use RODBC to read data from Excel. However, I'm having
> >>trouble converting missing values to NA and rather perplexed by the
> >>output. Below illustrates my problem:
> >>
> >>## DATA - copy to Excel and save as "tmp.xls"
> >>## tmp.xls!Sheet1
> >>x
> >>0.11
> >>0.11
> >>na
> >>na
> >>na
> >>0.11
> >>
> >>## tmp.xls!Sheet2
> >>x
> >>0.11
> >>0.11
> >>na
> >>na
> >>na
> >>na
> >>0.11
> >>
> >>## R Code
> >>read.xls <- function(file, sheet = "Sheet1", ...) {
> >>  require(RODBC)
> >>  channel <- odbcConnectExcel(file)
> >>  sheet <- sprintf("select * from `%s$`", sheet)
> >>  x <- sqlQuery(channel, sheet, ...)
> >>  odbcClose(channel)
> >>  x
> >>}
> >>
> >>read.xls("./tmp.xls", "Sheet1", na.strings = "na")
> >>## works as expected
> >>#     x
> >>#1 0.11
> >>#2 0.11
> >>#3   NA
> >>#4   NA
> >>#5   NA
> >>#6 0.11
> >>
> >>read.xls("./tmp.xls", "Sheet2", na.strings = "na")
> >>## Huh? What happened?
> >>#   x
> >>#1 NA
> >>#2 NA
> >>#3 NA
> >>#4 NA
> >>#5 NA
> >>#6 NA
> >>#7 NA
> >>
> >> > sessionInfo()
> >>Version 2.3.1 (2006-06-01)
> >>i386-pc-mingw32
> >>
> >>attached base packages:
> >>[1] "methods"   "stats"     "graphics"  "grDevices" "utils"     "datasets"
> >>[7] "base"
> >>
> >>other attached packages:
> >>  RODBC
> >>"1.1-7"
> >>
> >>______________________________________________
> >>R-help at stat.math.ethz.ch mailing list
> >>https://stat.ethz.ch/mailman/listinfo/r-help
> >>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
> >>
> >
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>



More information about the R-help mailing list