[R] Excel

David Scott d.scott at auckland.ac.nz
Tue Aug 28 07:21:55 CEST 2007

On Tue, 28 Aug 2007, Robert A LaBudde wrote:

> If you format the column as "Text", you won't have this problem. By
> leaving the cells as "General", you leave it up to Excel to guess at
> the correct interpretation.

Not true actually. I had converted the column to Text because I saw the 
interpretation as a date in the .xls file. I saved the .csv file *after* 
the column had been converted to Text. Looking at the .csv file in a text 
editor, the entry is correct.

I have just rechecked this.

On reopening the .csv using Excel, the entry AUG2699 had been interpreted 
as a date, and was showing as Aug-99. Most bizarre is that the NHI value 
of AUG1838 has *not* been interpreted as a date.

David Scott

> You will note that the conversion to a date occurs immediately in
> Excel when you enter the value. There are many formats to enter dates.
> Either pre-format the column as Text, or prefix the individual entry
> with an ' to indicate text.
> A similar problem occurs in R's read.table() function when a factor
> has levels that can be interpreted as numbers.
> At 10:11 PM 8/27/2007, David wrote:
>> A common process when data is obtained in an Excel spreadsheet is to save
>> the spreadsheet as a .csv file then read it into R. Experienced users
>> might have learned to be wary of dates (as I have) but possibly have not
>> experienced what just happened to me. I thought I might just share it with
>> r-help as a cautionary tale.
>> I received an Excel file giving patient details. Each patient had an ID
>> code in the form of three letters followed by four digits. (Actually a New
>> Zealand National Health Identification.) I saved the .xls file as .csv.
>> Then I opened up the .csv (with Excel) to look at it. In the column of ID
>> codes I saw: Aug-99. Clicking on that entry it showed 1/08/2699.
>> In a column of character data, Excel had interpreted AUG2699 as a date.
>> The .csv did not actually have a date in that cell, but if I had saved the
>> .csv file it would have.
>> David Scott
> ================================================================
> Robert A. LaBudde, PhD, PAS, Dpl. ACAFS  e-mail: ral at lcfltd.com
> Least Cost Formulations, Ltd.            URL: http://lcfltd.com/
> 824 Timberlake Drive                     Tel: 757-467-0954
> Virginia Beach, VA 23464-3239            Fax: 757-467-2947
> "Vere scire est per causas scire"
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Scott	Department of Statistics, Tamaki Campus
 		The University of Auckland, PB 92019
 		Auckland 1142,    NEW ZEALAND
Phone: +64 9 373 7599 ext 86830		Fax: +64 9 373 7000
Email:	d.scott at auckland.ac.nz

Graduate Officer, Department of Statistics
Director of Consulting, Department of Statistics

More information about the R-help mailing list