I've recently stumbled across data.table, Matthew Dowle's package. I'm impressed by the speed of the package in handling operations with large data.frames, but am a bit overwhelmed with the syntax. I'd like to express the SQL statement below using data.table operations rather than sqldf (which was incredibly slow for a small subset of my financial data) or import/export with a DBMS, but I haven't been able to figure out how to do it. I would be grateful for your suggestions.
nick My aim is to join events (trades) from two datasets ("edt" and "cdt") where, for the same stock, the events in one dataset occur between 15 and 75 days before the other, and within the same time window. I can only see how to express the "WHERE e.SYMBOL = c.SYMBOL" part in data.table syntax. I'm also at a loss at whether I can express the remainder using data.table's %between% operator or not. ctqm <- sqldf("SELECT e.*, c.DATE 'DATEctrl', c.TIME 'TIMEctrl', c.PRICE 'PRICEctrl', c.SIZE 'SIZEctrl' FROM edt e, ctq c WHERE e.SYMBOL = c.SYMBOL AND julianday(e.DATE) - julianday(c.DATE) BETWEEN 15 AND 75 AND strftime('%H:%M:%S',c.TIME) BETWEEN strftime('%H:%M:%S',e.BEGTIME) AND strftime('%H:%M:%S',e.ENDTIME)") [[alternative HTML version deleted]] ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.