[R] RODBC, missing values, and Excel
Sundar Dorai-Raj
sundar.dorai-raj at pdf.com
Thu Jul 13 13:43:56 CEST 2006
Hi, Gabor,
Thanks for the code. When I tried this I get an error when trying to use
a relative path name:
read.excel <- function(file, sheet, ...) {
require(rcom)
require(gdata)
oxl <- comCreateObject('Excel.Application')
comSetProperty(oxl, "Visible", TRUE) # this line optional
owb <- comGetProperty(oxl, "Workbooks")
ob <- comInvoke(owb, "Open", file)
osheets <- comGetProperty(ob, "Worksheets")
n <- comGetProperty(osheets, "Count")
ithSheetName <- function(i)
comGetProperty(comGetProperty(osheets, "Item", i), "Name")
sheetNames <- sapply(1:n, ithSheetName)
comInvoke(oxl, "Quit")
read.xls(file, match(sheet, sheetNames), ...)
}
> read.excel("tmp.xls", "Sheet2", na.strings = "na")
Error in 1:n : NA/NaN argument
> read.excel("D:/Users/sundard/frm/config/R/tmp.xls",
+ "Sheet2", na.strings = "na")
x
1 0.11
2 0.11
3 NA
4 NA
5 NA
6 NA
7 0.11
Any reason I need an absolute path?
Thanks again,
--sundar
Gabor Grothendieck wrote:
> In thinking about this some more I have a better idea. Use rcom (or
> RDCOMClient)
> to get a list of the sheet names and then use that to determine which sheet you
> need. Then use read.xls to get it like this assuming that the Excel
> file and path are C:\test.xls and that one of the sheets in that spreadsheet
> is xyz. In my version the na.strings had a space at the end so you may
> need to change the na.strings= setting:
>
> library(rcom)
> xls <- "C:\\test.xls"
> oxl <- comCreateObject('Excel.Application')
> comSetProperty(oxl, "Visible", TRUE) # this line optional
> owb <- comGetProperty(oxl, "Workbooks")
> ob <- comInvoke(owb, "Open", xls)
> osheets <- comGetProperty(ob, "Worksheets")
> n <- comGetProperty(osheets, "Count")
> ithSheetName <- function(i)
> comGetProperty(comGetProperty(osheets, "Item", i), "Name")
> sheetNames <- sapply(1:n, ithSheetName)
> comInvoke(oxl, "Quit")
>
> library(gdata)
> read.xls(xls, match("xyz", sheetNames), na.strings = "na ")
>
>
> On 7/12/06, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
>
>>Would it be good enough to just read all the sheets in?
>>
>>The perl program can do that and although the read.xls R function does not
>>interface to that aspect of its functionality its not that difficult to access
>>it yourself. Assume your excel file is in \test.xls . Just
>>switch to that folder. paste together a command to run the perl
>>program, run it, get a list of the file names it produced and read them in:
>>
>>library(gdata)
>>setwd("/")
>>cmd <- paste("perl", system.file("perl/xls2csv.pl", package = "gdata"), "test")
>>system(cmd)
>>ff <- list.files(patt = "test_Sheet.*.csv")
>>sapply(ff, read.csv, na.strings = "na ", simplify = FALSE)
>>
>>
>>On 7/12/06, Sundar Dorai-Raj <sundar.dorai-raj at pdf.com> wrote:
>>
>>>Hi, Gabor,
>>>
>>>Thanks for the reply. Perhaps Prof. Ripley will enlighten us as he is
>>>the RODBC maintainer.
>>>
>>>Unfortunately, gdata::read.xls will not work for me (at least I don't
>>>think it will) because I need to refer to each worksheet by name and not
>>>by number. For example, I need extract data from "Sheet1" and not simply
>>>the first sheet.
>>>
>>>Thanks,
>>>
>>>--sundar
>>>
>>>Gabor Grothendieck wrote:
>>>
>>>>I also got a strange result too (I renamed it sdr.read.xls
>>>>to distinguish it from read.xls in gdata and noticed that a
>>>>space got into my na's somehow so I used "na " for my
>>>>na.strings:
>>>>
>>>>
>>>>
>>>>>sdr.read.xls("/test.xls", "Sheet2", na.strings = "na ")
>>>>
>>>> x
>>>>1 <NA>
>>>>2 <NA>
>>>>3 na
>>>>4 na
>>>>5 na
>>>>6 na
>>>>7 <NA>
>>>>
>>>>I had more success using read.xls in the gdata package.
>>>>Note that we need to install perl first if not already present:
>>>>
>>>>
>>>>
>>>>>library(gdata) # for read.xls
>>>>>read.xls("/test.xls", 2, na.strings = "na ")
>>>>
>>>> x
>>>>1 0.11
>>>>2 0.11
>>>>3 NA
>>>>4 NA
>>>>5 NA
>>>>6 NA
>>>>7 0.11
>>>>
>>>>
>>>>
>>>>>R.version.string # XP
>>>>
>>>>[1] "Version 2.3.1 Patched (2006-06-04 r38279)"
>>>>
>>>>
>>>>>packageDescription("gdata")$Version
>>>>
>>>>[1] "2.1.2"
>>>>
>>>>
>>>>>packageDescription("RODBC")$Version
>>>>
>>>>[1] "1.1-7"
>>>>
>>>>
>>>>On 7/12/06, Sundar Dorai-Raj <sundar.dorai-raj at pdf.com> wrote:
>>>>
>>>>
>>>>>Hi, all,
>>>>>
>>>>>I'm trying to use RODBC to read data from Excel. However, I'm having
>>>>>trouble converting missing values to NA and rather perplexed by the
>>>>>output. Below illustrates my problem:
>>>>>
>>>>>## DATA - copy to Excel and save as "tmp.xls"
>>>>>## tmp.xls!Sheet1
>>>>>x
>>>>>0.11
>>>>>0.11
>>>>>na
>>>>>na
>>>>>na
>>>>>0.11
>>>>>
>>>>>## tmp.xls!Sheet2
>>>>>x
>>>>>0.11
>>>>>0.11
>>>>>na
>>>>>na
>>>>>na
>>>>>na
>>>>>0.11
>>>>>
>>>>>## R Code
>>>>>read.xls <- function(file, sheet = "Sheet1", ...) {
>>>>> require(RODBC)
>>>>> channel <- odbcConnectExcel(file)
>>>>> sheet <- sprintf("select * from `%s$`", sheet)
>>>>> x <- sqlQuery(channel, sheet, ...)
>>>>> odbcClose(channel)
>>>>> x
>>>>>}
>>>>>
>>>>>read.xls("./tmp.xls", "Sheet1", na.strings = "na")
>>>>>## works as expected
>>>>># x
>>>>>#1 0.11
>>>>>#2 0.11
>>>>>#3 NA
>>>>>#4 NA
>>>>>#5 NA
>>>>>#6 0.11
>>>>>
>>>>>read.xls("./tmp.xls", "Sheet2", na.strings = "na")
>>>>>## Huh? What happened?
>>>>># x
>>>>>#1 NA
>>>>>#2 NA
>>>>>#3 NA
>>>>>#4 NA
>>>>>#5 NA
>>>>>#6 NA
>>>>>#7 NA
>>>>>
>>>>>
>>>>>>sessionInfo()
>>>>>
>>>>>Version 2.3.1 (2006-06-01)
>>>>>i386-pc-mingw32
>>>>>
>>>>>attached base packages:
>>>>>[1] "methods" "stats" "graphics" "grDevices" "utils" "datasets"
>>>>>[7] "base"
>>>>>
>>>>>other attached packages:
>>>>> RODBC
>>>>>"1.1-7"
>>>>>
>>>>>______________________________________________
>>>>>R-help at stat.math.ethz.ch mailing list
>>>>>https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>>>>>
>>>>
>>>>
>>>>______________________________________________
>>>>R-help at stat.math.ethz.ch mailing list
>>>>https://stat.ethz.ch/mailman/listinfo/r-help
>>>>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>>>
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
More information about the R-help
mailing list