[R] Excel to R

Dirk Eddelbuettel edd at debian.org
Wed Oct 22 04:24:10 CEST 2003


On Tue, Oct 21, 2003 at 08:31:16PM -0400, Gabor Grothendieck wrote:
> 
> 
> I have Excel files containing data that I would like to move to R.
> They are in the standard form of a one row header followed by 
> rows of data, one record per row EXCEPT that there are a few
> rows of comments before the header.  The number of rows of comments
> varies.  For Excel files of this form without comments I have had
> success with:
> 
> require(RODBC)
> z <- odbcConnectExcel("C:/myspread.xls")
> z.df <- sqlFetch(z,"Sheet1")
> close(z)
> 
> but the comments interfere with this.  
> 
> I don't want to manually delete the rows but want the entire
> process from Excel file to R to be automatic.
> 
> I can accomplish this with a free utility, Baird's dataload that 
> I found on the net.  This will convert the Excel files to text 
> and then the text can be processed using R to locate the start of 
> the header and only process the remainder of the file.  (There is
> also another free utility called xlhtml that I don't use, but could 
> have, that does this too.) Thus at this point I have an 
> adequate automated solution.

There is also Spreadsheet::ParseExcel, which comes with a simple xls2csv
which I once extended, and posted here (or maybe only to BDR following a
discussion here).  Being Perl, it can easily be automated, and will cope
with your comment lines. If I recall, ActiveState provides this as well for
win* platforms.

> Nevertheless, I was wondering, for sake of interest, if there is 
> some solution in R that does not involve such an external program
> such as dataload or xlhtml.

There are a few candidates for a cross-platform solution:

- GNU Gretl (an econometric program with a nice Gnome GUI, and a win32 port)
  has code for this, taken from a C program xls2csv as well as from Gnumeric. 
  I had planned to look into this for R, but never got around to it.
  
- Gnumeric just added a standalone tool 'ssconvert', this may compile on
  Windows.  
  
- Also, OpenOffice has code for this which one could extract, but I am not
  familiar with the details.
  
Someone just has to sit down and do it. Typically the person with the
greatest urge wins. As I nowadays get all my data directly from databases
systems, I will probably not be the one.

Hth, Dirk

-- 
Those are my principles, and if you don't like them... well, I have others.
                                                -- Groucho Marx




More information about the R-help mailing list