[R] Truncating dates (and other date-time manipulations)
hadley wickham
h.wickham at gmail.com
Fri Sep 12 00:48:40 CEST 2008
> Hadley,
>
> What's wrong with:
>
> dates <- structure(c(8516, 8544, 8568, 8596, 8609, 8666, 8701, 8750,
> 8754, 8798, 8811, 8817, 8860, 8873, 8918, 8931,
> 8966, 9020, 9034, 9056), class = "Date")
>
>
The problem is this:
> as.Date(cut.Date(dates, "day"))
[1] "1993-04-26" "1993-05-24" "1993-06-17" "1993-07-15" "1993-07-28"
[6] "1993-09-23" "1993-10-28" "1993-12-16" "1993-12-20" "1994-02-02"
[11] "1994-02-15" "1994-02-21" "1994-04-05" "1994-04-18" "1994-06-02"
[16] "1994-06-15" "1994-07-20" "1994-09-12" "1994-09-26" NA
That is, the series isn't complete unless there is an observation on
every day, and the NA at the end is worrying too. (Similarly, with
"year" there are too many breaks, although that is easily fixed.)
seq.Date(min(dates), max(dates), "days")
works, but
seq.Date(min(dates), max(dates), "years")
does not, because I want years to start on the first day of the year:
when you see 1994 on a graph you expect it to refer to 1/1/94, not
26/4/94.
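To make the complaint concrete, here is a small sketch using only the
first and last dates of the vector above (1993-04-26 and 1994-10-18).
The names `endpoints` and `yearly` are mine, purely for illustration.
The yearly breaks inherit the day and month of the starting date
rather than landing on 1 January:

```r
# First and last dates of the series (8516 and 9056 as Date values)
endpoints <- as.Date(c("1993-04-26", "1994-10-18"))

# Yearly steps start from the minimum itself, not from the start of its year
yearly <- seq(endpoints[1], endpoints[2], by = "year")
format(yearly)
# "1993-04-26" "1994-04-26"
```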
So cut and seq each do a bit of what I need, but not enough.
Combining the two comes pretty close:
start <- as.Date(cut.Date(min(dates), "year"))
end <- as.Date(cut.Date(max(dates), "year"))
seq.Date(start, end, "years")
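but cut floors both endpoints, so the sequence stops at 1994-01-01 and
does not cover the later 1994 dates. I need to always round the minimum
down (floor) and the maximum up (ceiling).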
I need a method like:
fullseq <- function(range, size) {
  seq(
    round_any(range[1], size, floor),
    round_any(range[2], size, ceiling),
    by = size)
}
that works with dates (round_any is from the reshape package)
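For the yearly case at least, something along these lines might do as
a stopgap. This is only a sketch in base R, not a general date-aware
round_any: `floor_year`, `ceiling_year` and `fullseq_year` are
hypothetical names of my own, not functions from reshape or base R,
and other units (months, weeks) would need their own handling.

```r
# Hypothetical helper: round each Date down to 1 January of its year
floor_year <- function(x) {
  as.Date(format(x, "%Y-01-01"))
}

# Hypothetical helper: round each Date up to the next 1 January,
# leaving dates that already fall on 1 January unchanged
ceiling_year <- function(x) {
  floored <- floor_year(x)
  year <- as.integer(format(x, "%Y"))
  next_start <- as.Date(sprintf("%04d-01-01", year + 1L))
  out <- floored
  needs_up <- !is.na(x) & x > floored
  out[needs_up] <- next_start[needs_up]
  out
}

# Yearly analogue of fullseq(): floor the minimum, ceiling the maximum
fullseq_year <- function(range) {
  seq(floor_year(range[1]), ceiling_year(range[2]), by = "year")
}

# First and last dates of the series in this thread
date_range <- as.Date(c("1993-04-26", "1994-10-18"))
format(fullseq_year(date_range))
# "1993-01-01" "1994-01-01" "1995-01-01"
```

The breaks now start on 1 January and bracket the whole series, which
is the behaviour I would expect from an axis labelled by year.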
Hadley
--
http://had.co.nz/