[R] read.table behavior for Dates.

Greg Snow Greg.Snow at imail.org
Fri Apr 16 22:26:50 CEST 2010

Part of the issue is that the creators of R and the read.table function make the unusual, but hopefully correct, assumption that you know more about your data than they do.  This gives you flexibility, power, and yes, responsibility.

What should read.table do if it sees an entry like "04-10-12"?  Is that a date in USA format (month-day-year), a date in European format (day-month-year), or a date in sortable format (year-month-day)?  Is it a range variable (4 to 12, with 10 having some meaning)?  Is it an ID number?  Or something else?
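
To make the ambiguity concrete, here is a small sketch (nothing more than an illustration) that parses that same string under three formats with as.Date, and shows that read.table, when told not to create factors, simply leaves it as text.  The variable names are just for this example.

```r
x <- "04-10-12"

# Three equally plausible readings of the same string
us  <- as.Date(x, format = "%m-%d-%y")  # month-day-year
eu  <- as.Date(x, format = "%d-%m-%y")  # day-month-year
iso <- as.Date(x, format = "%y-%m-%d")  # year-month-day

print(us)
print(eu)
print(iso)

# read.table does not guess: the column stays character
df <- read.table(text = x, header = FALSE, stringsAsFactors = FALSE)
print(class(df$V1))
```

The three calls give three different dates, and nothing in the string itself says which one is right.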

If base R chooses one of the above, it may make life easier for you (at least for this project), but it could end up being a nightmare for others (including yourself on another project).  One of the things I hate about Excel is that when I use it to set up a table for a client and put a range in as a row or column label, something like 1-5, Excel, clearly knowing better than I do, turns that into a date, 5-Jan.

You can write your own function that reads the data (maybe just the first few rows), looks for columns that look like dates, and then converts them (or reads in the whole thing, telling read.table how to do the conversion).  That way you have complete control, can make any appropriate decisions, and do not create headaches for other people (though possibly for yourself if something changes).
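
A minimal sketch of such a function might look like the following.  The names looks_like_date and read_with_dates, the default format, the default pattern, and the number of rows to peek at are all assumptions made for illustration; you would choose the format that you know applies to your data.  It assumes `file` is a path (not a connection), since the file is read twice.

```r
# Hypothetical helper: TRUE if every non-empty value matches the pattern
# and parses as a date under the given format.
looks_like_date <- function(x, fmt, pattern) {
  x <- x[!is.na(x) & nzchar(x)]
  length(x) > 0 &&
    all(grepl(pattern, x)) &&
    !anyNA(as.Date(x, format = fmt))
}

# Hypothetical wrapper: peek at the first rows, decide which columns are
# dates, then read the whole file and convert only those columns.
read_with_dates <- function(file,
                            fmt = "%Y-%m-%d",
                            pattern = "^[0-9]{4}-[0-9]{2}-[0-9]{2}$",
                            nrows_peek = 10, ...) {
  peek <- read.table(file, nrows = nrows_peek, stringsAsFactors = FALSE, ...)
  is_date <- vapply(
    peek,
    function(col) is.character(col) && looks_like_date(col, fmt, pattern),
    logical(1)
  )
  dat <- read.table(file, stringsAsFactors = FALSE, ...)
  if (any(is_date)) {
    dat[is_date] <- lapply(dat[is_date], as.Date, format = fmt)
  }
  dat
}

# Usage on a small temporary file: 'when' holds dates, 'range' does not
tmp <- tempfile(fileext = ".txt")
writeLines(c("id when range",
             "a 2010-04-16 1-5",
             "b 2010-04-17 2-6"), tmp)
dat <- read_with_dates(tmp, header = TRUE)
print(sapply(dat, class))
```

Because the decision is made only from the first few rows, a later row that does not fit the format would become NA in the converted column, so you may want to check for that after reading.  The pattern check is there because as.Date by itself is fairly lenient about what it accepts.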

Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Jeroen Ooms
> Sent: Friday, April 16, 2010 1:57 PM
> To: r-help at r-project.org
> Subject: Re: [R] read.table behavior for Dates.
> Yes I know I can manually do it, but I am using it for scripts in which users
> upload files. Hence, I don't know what's going to come; I don't know
> beforehand whether data will contain Dates, and in which columns they appear.
> This is why I was surprised that read.table has some (undocumented) behavior
> of converting columns that look like Dates to a different format.
> I would like to control this default behavior of the read.table function,
> rather than set a type for a specific column of a specific file.
