[R] Reading XML files masquerading as XL files
Dennis Fisher
fisher at plessthan.com
Wed Aug 10 16:26:35 CEST 2011
R version 2.13.1
OS X (or Windows)
Colleagues,
I received a number of files with a .xls extension. These files open in Excel and, by all appearances, are Excel files. However, their first lines suggest that they are actually XML (apparently Excel's XML Spreadsheet format):
> readLines(dir()[16])[1:10]
[1] "<?xml version=\"1.0\"?>"
[2] "<Workbook xmlns=\"urn:schemas-microsoft-com:office:spreadsheet\""
[3] " xmlns:o=\"urn:schemas-microsoft-com:office:office\""
[4] " xmlns:x=\"urn:schemas-microsoft-com:office:excel\""
[5] " xmlns:ss=\"urn:schemas-microsoft-com:office:spreadsheet\""
[6] " xmlns:html=\"http://www.w3.org/TR/REC-html40\">"
[7] " <DocumentProperties xmlns=\"urn:schemas-microsoft-com:office:office\">"
[8] " <Version>12.0</Version>"
[9] " </DocumentProperties>"
[10] " <OfficeDocumentSettings xmlns=\"urn:schemas-microsoft-com:office:office\">"
I initially tried to read the files using read.xls (gdata), but that failed (not surprisingly, since read.xls expects a genuine Excel file). I could open each file in Excel, "save as" csv, and then use read.csv. However, there are many files, so I would love to have a solution that does not require this brute-force approach.
Are there any packages that would allow me to read these files without the additional steps?
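To make the question concrete, here is a minimal sketch of what reading such a file directly might look like. It assumes the CRAN package XML is installed and that the files follow the usual XML Spreadsheet layout (Workbook / Worksheet / Table / Row / Cell / Data) in the namespace shown in the output above. The helper name read_spreadsheetml is hypothetical, not an existing function. The sketch also assumes that every row is fully populated: empty cells and cells positioned with an ss:Index attribute are not handled and would shift columns.

```r
library(XML)  # assumption: the CRAN package XML is available

# Hypothetical helper: read the first worksheet of an XML Spreadsheet file
# into a data frame. Assumes no skipped or empty cells within a row.
read_spreadsheetml <- function(path, header = TRUE) {
  doc <- xmlParse(path)
  # Map a prefix to the default namespace seen in the file's <Workbook> tag
  ns <- c(ss = "urn:schemas-microsoft-com:office:spreadsheet")

  rows <- getNodeSet(doc, "/ss:Workbook/ss:Worksheet[1]/ss:Table/ss:Row",
                     namespaces = ns)
  if (length(rows) == 0) return(data.frame())

  # Text content of each cell, one character vector per row
  cells <- lapply(rows, function(row) {
    as.character(unlist(
      xpathSApply(row, "./ss:Cell/ss:Data", xmlValue, namespaces = ns)
    ))
  })

  # Pad short rows with NA so that all rows have the same width
  width <- max(sapply(cells, length))
  mat <- do.call(rbind, lapply(cells, function(x) {
    c(x, rep(NA_character_, width - length(x)))
  }))

  if (header) {
    out <- as.data.frame(mat[-1, , drop = FALSE], stringsAsFactors = FALSE)
    names(out) <- mat[1, ]
  } else {
    out <- as.data.frame(mat, stringsAsFactors = FALSE)
  }

  # Convert columns that look numeric; leave the rest as character
  out[] <- lapply(out, type.convert, as.is = TRUE)
  out
}
```

Under those assumptions, all files in a directory could be read with something like lapply(list.files(pattern = "\\.xls$"), read_spreadsheetml).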
Dennis
Dennis Fisher MD
P < (The "P Less Than" Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-866-PLessThan (1-866-753-7784)
www.PLessThan.com