[Rd] New strptime conversion specification for ordinal suffixes
Michael Chirico
michaelchirico4 at gmail.com
Wed Aug 31 16:10:47 CEST 2016
As touched on briefly on SO <http://stackoverflow.com/questions/39237299>,
base R has what appears to me to be a serious deficiency: it cannot parse
dates stored as character strings that carry ordinal suffixes, e.g.:
ord_dates <- c("September 1st, 2016", "September 2nd, 2016",
"September 3rd, 2016", "September 4th, 2016")
?strptime lists no conversion specification which could match ord_dates in
one pass (as I discovered, even lubridate only manages to succeed by going
through the vector in several passes).
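To make the failure concrete, here is a minimal sketch using only the existing conversion specifications. With them, the suffix can only be written as a literal in the format string, so any single format covers at most one suffix. The variable names no_suffix and literal_st are illustrative, and an English LC_TIME locale is assumed so that %B matches "September".

```r
ord_dates <- c("September 1st, 2016", "September 2nd, 2016",
               "September 3rd, 2016", "September 4th, 2016")

# No suffix in the format: after %d consumes the day number, the format
# expects "," but the input continues with a letter, so nothing parses.
no_suffix <- as.Date(ord_dates, format = "%B %d, %Y")

# A literal "st" in the format can only ever match the "1st" element;
# "2nd", "3rd" and "4th" fail to match and come back as NA.
literal_st <- as.Date(ord_dates, format = "%B %dst, %Y")

print(no_suffix)
print(literal_st)
```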
How difficult would it be to add a new conversion specification to handle
this? It seems to me to be a pretty common date format to encounter in raw
data in the wild.
My suggestion would be %o for ordinal suffixes. These would obviously be
locale-specific, but in English, %o would match:
- st
- nd
- rd
- th
Other languages may be covered by the Wikipedia pages on ordinal indicators
<https://en.wikipedia.org/wiki/Ordinal_indicator> and/or on Unicode
subscripts and superscripts
<https://en.wikipedia.org/wiki/Unicode_subscripts_and_superscripts>.
With this implemented, converting ord_dates to a Date or POSIXct would be
as simple as:
as.Date(ord_dates, format = "%B %d%o, %Y")
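Until such a specification exists, one single-pass workaround is to strip the suffix before parsing. The following is only a minimal sketch: strip_ordinal is a hypothetical helper (not part of base R), it handles English suffixes only, and the final as.Date call assumes an English LC_TIME locale for %B.

```r
# Hypothetical helper, not part of base R: drop an English ordinal suffix
# that directly follows a digit, so the existing %d specification applies.
strip_ordinal <- function(x) {
  # Requiring a preceding digit keeps words such as "August" intact.
  sub("([0-9])(st|nd|rd|th)", "\\1", x)
}

ord_dates <- c("September 1st, 2016", "September 2nd, 2016",
               "September 3rd, 2016", "September 4th, 2016")

# Remove the suffixes in one vectorised pass.
cleaned <- strip_ordinal(ord_dates)

# Parse with the standard specifications (English locale assumed for %B).
parsed <- as.Date(cleaned, format = "%B %d, %Y")
print(parsed)
```

This only emulates the proposed %o by preprocessing the input; it does not validate that the suffix agrees with the day number (e.g. "1nd" would be accepted).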
Is there something on the C level preventing this from happening?
Michael Chirico