[R] sqldf + Date class. Ordering and summary statistics appear to be incorrect.

Grant Farnsworth gvfarns at gmail.com
Mon Jan 23 04:59:17 CET 2012


I've been using sqldf heavily lately but have encountered problems
with ordering of observations or calculating statistics such as max()
and min() when the variable used is of class Date.

For example, if I run the following code:

=============== begin code =================
library(sqldf)
A<-data.frame(Dates=as.Date(c("1994-02-14","1977-02-23","2001-09-18","2009-08-01")),Ret=rnorm(4))
OrderedA<-sqldf('select * from A order by Dates')
MaxA<-sqldf('select max(Dates) as Dates from A')[1,1]
MinA<-sqldf('select min(Dates) as Dates from A')[1,1]
=============== end code =================

Then the result is this:

> A
       Dates        Ret
1 1994-02-14  1.2414706
2 1977-02-23 -0.7728146
3 2001-09-18  1.2551331
4 2009-08-01 -0.2538359

> OrderedA
       Dates        Ret
1 2001-09-18  1.2551331
2 2009-08-01 -0.2538359
3 1977-02-23 -0.7728146
4 1994-02-14  1.2414706

> MaxA
[1] "1994-02-14"

> MinA
[1] "2001-09-18"

Completely wrong order, no warnings issued, and the summary stats are
wrong as well (but consistent with the ordering).

According to the sqldf manual found at the following URL

http://code.google.com/p/sqldf/#4._How_does_sqldf_work_with_%22Date%22_class_variables?

this type of query should work correctly.  Any clue why it is not
doing so?  User error or bug?

=================== debug info =================
> sessionInfo()
R version 2.13.1 (2011-07-08)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] tcltk     stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] sqldf_0.4-2           chron_2.3-42          gsubfn_0.5-7
[4] proto_0.3-9.2         RSQLite.extfuns_0.0.1 RSQLite_0.10.0
[7] DBI_0.2-5
================ end debug info =================



More information about the R-help mailing list