[R] Excel

David Scott d.scott at auckland.ac.nz
Fri Aug 31 00:13:04 CEST 2007


On Thu, 30 Aug 2007, Duncan Murdoch wrote:

> On 8/28/2007 3:16 AM, J Dougherty wrote:
>> On Monday 27 August 2007 22:21, David Scott wrote:
>>> On Tue, 28 Aug 2007, Robert A LaBudde wrote:
>>>> If you format the column as "Text", you won't have this problem. By
>>>> leaving the cells as "General", you leave it up to Excel to guess at
>>>> the correct interpretation.
>>>
>>> Not true actually. I had converted the column to Text because I saw the
>>> interpretation as a date in the .xls file. I saved the .csv file *after*
>>> the column had been converted to Text. Looking at the .csv file in a text
>>> editor, the entry is correct.
>>>
>>> I have just rechecked this.
>>>
>>> On reopening the .csv using Excel, the entry AUG2699 had been interpreted
>>> as a date, and was showing as Aug-99. Most bizarre is that the NHI value
>>> of AUG1838 has *not* been interpreted as a date.
>>>
>> Actually, in Excel 2000, he's right.  What you have to is be sure of is that
>> the "'" that denotes a text entry precedes EVERY entry that can be confused
>> with a date.  Selecting the entire column and setting the format to "text"
>> *before* data is entered does this.  It will also create an appropriate *.csv
>> file.  Excel is notable too because it will automatically convert "date-like"
>> entries as you type.  In a column of IDs or similar critical data, that
>> behaviour is really bad.  I have never tried the MS site, but I haven't been
>> able to find any entry about how to turn that particular automatic behaviour
>> off.
>>
>> However, while I have not experimented extensively, as far as I have
>> experimented, OpenOffice spreadsheet does not behave this way.
>
> I don't use Excel, but in OpenOffice 2.2.1 the ' is lost when a file is
> saved as .csv and reloaded.  So if I take care and enter
>
> 'November 15
>
> in a cell, then save it, OO will change it to 11/15/2007 when I reload.
>  I can override this change by manually changing "Standard" format to
> "Text" *every time* I load the file.  There's a help index entry "date
> formats;avoiding conversion to", but it offers no more help than "add an
> apostrophe at the beginning of the entry".
>
> This is brain-dead behaviour.
>
> Duncan Murdoch
>

This was the behaviour that really scared me in Excel: saving as .csv 
loses any formatting (it is just an ascii file, how can it have formatting 
info?). Then opening in Excel (or it seems OO), the incorrect date 
interpretation occurs. If I then save the .csv I have erroneous data.

I often do just this sort of thing because I get given data in .xls, it 
has clunky column names or extraneous stuff so I alter it, save it as 
.csv. Then I get a data correction, some clarification of a value, so I 
want to go to the .csv to correct that data value. Once I do that if I am 
not *extremely* careful, before saving the .csv file, I have a problem.

David Scott
_________________________________________________________________
David Scott	Department of Statistics, Tamaki Campus
 		The University of Auckland, PB 92019
 		Auckland 1142,    NEW ZEALAND
Phone: +64 9 373 7599 ext 86830		Fax: +64 9 373 7000
Email:	d.scott at auckland.ac.nz

Graduate Officer, Department of Statistics
Director of Consulting, Department of Statistics



More information about the R-help mailing list