[R] translating SQL statements into data.table operations
Gabor Grothendieck
ggrothendieck at gmail.com
Wed Mar 24 23:29:23 CET 2010
Note that, in general, you can speed up joins, even within sqldf, by
adding indexes to your tables and ensuring that your select statement
is written in such a way that the indexes are used. See example 4i on
the sqldf home page.
On Wed, Mar 24, 2010 at 4:51 PM, Nick Switanek <nswitanek at gmail.com> wrote:
> I've recently stumbled across data.table, Matthew Dowle's package. I'm
> impressed by the speed of the package in handling operations with large
> data.frames, but am a bit overwhelmed with the syntax. I'd like to express
> the SQL statement below using data.table operations rather than sqldf (which
> was incredibly slow for a small subset of my financial data) or
> import/export with a DBMS, but I haven't been able to figure out how to do
> it. I would be grateful for your suggestions.
>
> nick
>
>
>
> My aim is to join events (trades) from two datasets ("edt" and "cdt") where,
> for the same stock, the events in one dataset occur between 15 and 75 days
> before the other, and within the same time window. I can only see how to
> express the "WHERE e.SYMBOL = c.SYMBOL" part in data.table syntax. I'm also
> at a loss at whether I can express the remainder using data.table's
> %between% operator or not.
>
> ctqm <- sqldf("SELECT e.*,
> c.DATE 'DATEctrl',
> c.TIME 'TIMEctrl',
> c.PRICE 'PRICEctrl',
> c.SIZE 'SIZEctrl'
>
> FROM edt e, ctq c
>
> WHERE e.SYMBOL = c.SYMBOL AND
> julianday(e.DATE) - julianday(c.DATE) BETWEEN 15 AND
> 75 AND
> strftime('%H:%M:%S',c.TIME) BETWEEN
> strftime('%H:%M:%S',e.BEGTIME) AND strftime('%H:%M:%S',e.ENDTIME)")
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
More information about the R-help
mailing list