m_olshansky at yahoo.com
Tue Aug 28 07:34:01 CEST 2007
As far as I understand, changing the format changes
the way data is displayed by Excel but this does not
change the data itself - if while reading the data
Excel decided that it was a date, it is being
converted to an integer (the number of days since
January 1, 1900 - and they mistakenly think that 1900
was a leap year) and it is stored this way.
--- David Scott <d.scott at auckland.ac.nz> wrote:
> On Tue, 28 Aug 2007, Robert A LaBudde wrote:
> > If you format the column as "Text", you won't have
> this problem. By
> > leaving the cells as "General", you leave it up to
> Excel to guess at
> > the correct interpretation.
> Not true actually. I had converted the column to
> Text because I saw the
> interpretation as a date in the .xls file. I saved
> the .csv file *after*
> the column had been converted to Text. Looking at
> the .csv file in a text
> editor, the entry is correct.
> I have just rechecked this.
> On reopening the .csv using Excel, the entry AUG2699
> had been interpreted
> as a date, and was showing as Aug-99. Most bizarre
> is that the NHI value
> of AUG1838 has *not* been interpreted as a date.
> David Scott
> > You will note that the conversion to a date occurs
> immediately in
> > Excel when you enter the value. There are many
> formats to enter dates.
> > Either pre-format the column as Text, or prefix
> the individual entry
> > with an ' to indicate text.
> > A similar problem occurs in R's read.table()
> function when a factor
> > has levels that can be interpreted as numbers.
> > At 10:11 PM 8/27/2007, David wrote:
> >> A common process when data is obtained in an
> Excel spreadsheet is to save
> >> the spreadsheet as a .csv file then read it into
> R. Experienced users
> >> might have learned to be wary of dates (as I
> have) but possibly have not
> >> experienced what just happened to me. I thought I
> might just share it with
> >> r-help as a cautionary tale.
> >> I received an Excel file giving patient details.
> Each patient had an ID
> >> code in the form of three letters followed by
> four digits. (Actually a New
> >> Zealand National Health Identification.) I saved
> the .xls file as .csv.
> >> Then I opened up the .csv (with Excel) to look at
> it. In the column of ID
> >> codes I saw: Aug-99. Clicking on that entry it
> showed 1/08/2699.
> >> In a column of character data, Excel had
> interpreted AUG2699 as a date.
> >> The .csv did not actually have a date in that
> cell, but if I had saved the
> >> .csv file it would have.
> >> David Scott
> > Robert A. LaBudde, PhD, PAS, Dpl. ACAFS e-mail:
> ral at lcfltd.com
> > Least Cost Formulations, Ltd. URL:
> > 824 Timberlake Drive Tel:
> > Virginia Beach, VA 23464-3239 Fax:
> > "Vere scire est per causas scire"
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > and provide commented, minimal, self-contained,
> reproducible code.
> David Scott Department of Statistics, Tamaki Campus
> The University of Auckland, PB 92019
> Auckland 1142, NEW ZEALAND
> Phone: +64 9 373 7599 ext 86830 Fax: +64 9 373 7000
> Email: d.scott at auckland.ac.nz
> Graduate Officer, Department of Statistics
> Director of Consulting, Department of Statistics
> R-help at stat.math.ethz.ch mailing list
> PLEASE do read the posting guide
> and provide commented, minimal, self-contained,
> reproducible code.
More information about the R-help