[R] Yet another question about importing Excel *.xls files
David Scott
d.scott at auckland.ac.nz
Wed Jan 20 23:38:07 CET 2010
Jerry Floren wrote:
> Hi Gabor,
>
> Thanks for your insights and suggestions. There was a post on the Wiki you
> mentioned that makes me think this will work.
>
> Unfortunately, the spreadsheet was designed to make it easy for lab staff to
> enter their results, but not so easy for another program to read in the
> data. That is why I copy and paste the original spreadsheet into a second
> spreadsheet with a separate page that could be read beginning at the top
> (A1) cell. Here is a link to the spreadsheet the labs use.
>
> http://www.mda.state.mn.us/en/sitecore/content/Global/MDADocs/licensing/map/mapreportform.aspx
>
> A big problem with this format is that the lab's ID code ends up in each
> column while the data is organized by rows. Also, there are three different
> samples with three different replicates. The spreadsheet is set up so that
> the replicates are not identified. Even if the data had to be copied to a
> second spreadsheet (with the conversion page), it would be a significant
> improvement over the current method. I just learned that it is possible to
> hide a worksheet in Excel. (I do not want the lab staff to see the hidden
> worksheet as then the replicates can be identified).
>
> The spreadsheets are e-mailed to me. I have to open them up and save them
> with a different file name. If the first worksheet is hidden, but has the
> data arranged properly starting in cell A1, it looks like the entire(hidden)
> worksheet be read in according to this from the Wiki:
>
> "Entire Worksheets
> Use the following code to import in all of worksheet called “Sheet 1”. The
> hazard with this approach is that any and all data in that worksheet will be
> copied in, this includes data that are hidden or that you otherwise were not
> intending to bring in.
>
> library(RODBC)
>
> MyExcelData <- sqlFetch(odbcConnectExcel("Test.xls"),
> sqtable = "Sheet1", na.strings = "NA", as.is = T)
> odbcCloseAll()Caution
> Excel 2003 (and earlier?) use the first 0-16 rows to guess the data type.
> Consider a column of international postal codes where the first 20 rows
> contain 50010 and the next two rows contain 500A1 and 500E1. The value of
> ‘500A1’ is likely to be interpreted as a missing value and the value of
> ‘500E1’ may be interpreted as a numeric value that is in exponential format.
> More information can be found here:
> http://www.dicks-blog.com/archives/2004/06/03/external-data-mixed-data-types/.
> "
>
> I'm going to play around with this and see if I can get it to work for my
> data.
>
> Thanks,
>
> Jerry Floren
> Minnesota Department of Agriculture
>
As an alternative to Gabor's suggestions, I have always had excellent
results from xlsReadWrite which is back on CRAN.
The author, Hans-Peter Suter, also offers xlsReadWritePro at
http://www.swissr.org/ which can write native xls files with multiple
sheets.
David Scott
--
_________________________________________________________________
David Scott Department of Statistics
The University of Auckland, PB 92019
Auckland 1142, NEW ZEALAND
Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
Email: d.scott at auckland.ac.nz, Fax: +64 9 373 7018
Director of Consulting, Department of Statistics
More information about the R-help
mailing list