[R] Bug in hist() when working with Dates ?
Martin Maechler
maechler at stat.math.ethz.ch
Tue Jun 2 10:05:21 CEST 2009
>>>>> "SN" == S Nunes <snunes at gmail.com>
>>>>> on Mon, 1 Jun 2009 16:45:54 +0100 writes:
SN> Hi, It seems that hist() has a buggy behavior when
SN> breaking over "days". The bug can be reproduced in a
SN> few steps:
>> d=data.frame(date=c("2009-01-01", "2009-01-02", "2009-01-02"))
>> d$date=as.Date(d$date)
>> d$date
SN> [1] "2009-01-01" "2009-01-02" "2009-01-02"
>> h=hist(d$date, "days")
>> h$count
SN> [1] 3
It is much simpler and less confusing not to go via a data frame
(and 'plot=FALSE' suppresses the plot, which is not the issue here):
d. <- as.Date(c("2009-01-01", "2009-01-02", "2009-01-02"))
str(h <- hist(d., "days", plot=FALSE))
This does give what you observe.
It is not a bug as it is consistent with the default histogram
behavior:
> str(hist(c(1,2,2), breaks=1:2, plot=FALSE))
List of 7
$ breaks : int [1:2] 1 2
$ counts : int 3
$ intensities: num 1
$ density : num 1
$ mids : num 1.5
$ xname : chr "c(1, 2, 2)"
$ equidist : logi TRUE
- attr(*, "class")= chr "histogram"
SN> Despite the fact that the original data contains 2
SN> distinct days. The call hist() only returns one "break",
SN> adding the occurrences of both days. I would expect the
SN> last output to be: [1] 1 2.
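As you see, your expectation is wrong: with the default
include.lowest=TRUE, the single cell is the closed interval [1, 2],
so all three values fall into it.
It may help if you also use cut() (and read its help page)
and study the behavior of the
'include.lowest' and 'right' arguments to both cut and hist.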
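To make the cut() suggestion concrete, here is a minimal sketch on the same
numeric toy data as above (the names x, default, closed and shifted are only
for illustration, they are not from the original post):

```r
x <- c(1, 2, 2)

# cut() defaults: right = TRUE, include.lowest = FALSE,
# so the only cell is (1, 2] and the value 1 falls outside it (NA)
default <- cut(x, breaks = 1:2)

# hist() defaults to include.lowest = TRUE, which closes the cell
# to [1, 2]; all three values then land in the same cell
closed <- cut(x, breaks = 1:2, include.lowest = TRUE)

# one extra break below the minimum gives two cells, (0, 1] and (1, 2],
# so the values 1 and 2 are counted separately
shifted <- cut(x, breaks = 0:2)

table(default, useNA = "ifany")
table(closed)
table(shifted)
```

The last variant is the numeric analogue of starting the breaks one day
before the first date.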
SN> I am using R version 2.9.0.
SN> I would like to know if this behavior is correct or a
SN> bug?
"correct", as said above.
SN> Thanks in advance for your comments on this issue,
I agree that it may be useful if hist.Date(), the method used here,
made it easy to produce what you expected
when you use
str(h <- hist(d., "days", include.lowest=FALSE))
{which gives an error now},
namely by effectively using what you can now get via
str(hist(d., breaks= seq(min(d.)-1, max(d.), "days")))
{hint: the above is the solution for your problem}
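Spelled out on the example data, the workaround looks roughly like this.
It is only a sketch: 'brks' is an illustrative name, and plot=FALSE is added
just to suppress the plot.

```r
d. <- as.Date(c("2009-01-01", "2009-01-02", "2009-01-02"))

# start the breaks one day before the first date, so that with the
# default right-closed cells every day gets a cell of its own
brks <- seq(min(d.) - 1, max(d.), by = "days")

h <- hist(d., breaks = brks, plot = FALSE)
h$counts

# cross-check: plain per-day tabulation of the same data
table(d.)
```

With three breaks there are two cells, one ending on each of the two
observed days, so h$counts should agree with the table() tabulation.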
Patches to hist.Date() [in src/library/graphics/R/datetime.R],
tested with 'make check-devel', are welcome.
Martin Maechler, ETH Zurich
SN> -- Sergio Nunes