[R] Excel
Gabor Grothendieck
ggrothendieck at gmail.com
Thu Sep 6 19:18:02 CEST 2007
On my version of Excel (Excel 2007 under Vista) using
File | Open on a file, a.txt such as:
a b
sep7 10
sep10 11
causes it to enter a wizard where it asks you for the delimiters and
column types so you can change it from what it offers as the default.
In particular, if you leave it at General it will guess Date but you can
specify Text or you can specify Date to cause it to select a
particular type.
On 9/6/07, J.delasHeras at ed.ac.uk <J.delasHeras at ed.ac.uk> wrote:
> Quoting Robert A LaBudde <ral at lcfltd.com>:
>
> > If you format the column as "Text", you won't have this problem. By
> > leaving the cells as "General", you leave it up to Excel to guess at
> > the correct interpretation.
> >
> > You will note that the conversion to a date occurs immediately in
> > Excel when you enter the value. There are many formats to enter dates.
> >
> > Either pre-format the column as Text, or prefix the individual entry
> > with an ' to indicate text.
>
> But the conversion is done as soon as the file is opened, _before_ you
> have the chance to format the column as text!!!
> Once the conversion is done... it's done.
> I had gene names such as "SEP7" converted by Excel into a 5 digit
> number representing a date. From that number I didn't find a way to
> reconstruct "SEP7". "Sept-7" is not the same.
>
> It seems like a problem with an easy solution. But it isn't. There are
> too many variations.
>
> > A similar problem occurs in R's read.table() function when a factor
> > has levels that can be interpreted as numbers.
>
> at least with read.table you can specify the classes of each column
> _before_ you read the file.
>
> R developers are better behaved than MS Excel ones ;-)
>
> Jose
>
> >
> > At 10:11 PM 8/27/2007, David wrote:
> >
> >> A common process when data is obtained in an Excel spreadsheet is to save
> >> the spreadsheet as a .csv file then read it into R. Experienced users
> >> might have learned to be wary of dates (as I have) but possibly have not
> >> experienced what just happened to me. I thought I might just share it with
> >> r-help as a cautionary tale.
> >>
> >> I received an Excel file giving patient details. Each patient had an ID
> >> code in the form of three letters followed by four digits. (Actually a New
> >> Zealand National Health Identification.) I saved the .xls file as .csv.
> >> Then I opened up the .csv (with Excel) to look at it. In the column of ID
> >> codes I saw: Aug-99. Clicking on that entry it showed 1/08/2699.
> >>
> >> In a column of character data, Excel had interpreted AUG2699 as a date.
> >>
> >> The .csv did not actually have a date in that cell, but if I had saved the
> >> .csv file it would have.
> >>
> >> David Scott
> >
> > ================================================================
> > Robert A. LaBudde, PhD, PAS, Dpl. ACAFS e-mail: ral at lcfltd.com
> > Least Cost Formulations, Ltd. URL: http://lcfltd.com/
> > 824 Timberlake Drive Tel: 757-467-0954
> > Virginia Beach, VA 23464-3239 Fax: 757-467-2947
> >
> > "Vere scire est per causas scire"
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> >
>
>
>
> --
> Dr. Jose I. de las Heras Email: J.delasHeras at ed.ac.uk
> The Wellcome Trust Centre for Cell Biology Phone: +44 (0)131 6513374
> Institute for Cell & Molecular Biology Fax: +44 (0)131 6507360
> Swann Building, Mayfield Road
> University of Edinburgh
> Edinburgh EH9 3JR
> UK
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
More information about the R-help
mailing list