[R] Merge with closest (not equal) time stamps
Naresh Gurbuxani
n@re@h_gurbux@n| @end|ng |rom hotm@||@com
Tue Aug 8 13:12:32 CEST 2023
I was able to adapt your solution using packages with which I am more familiar.
myres2 <- merge(option.trades, stock.trades, by = "timestamp", all =
TRUE)
myres2[,"stock.timestamp"] <- ifelse(is.na(myres2$stock.price), NA,
myres2$timestamp)
myres2$stock.timestamp <- as.POSIXct(myres2$stock.timestamp, origin =
"1970-01-01")
library(zoo)
myres2$stock.price <- na.locf(myres2$stock.price)
myres2$stock.timestamp <- na.locf(myres2$stock.timestamp)
myres2 <- myres2[!is.na(myres2$option.price),]
row.names(myres2) <- NULL
all.equal(myres, myres2[,c(1, 2, 4, 3)]) # TRUE
This calculation is indeed faster.
Thanks for your help,
Naresh
> On Aug 8, 2023, at 5:39 AM, Eric Berger <ericjberger using gmail.com> wrote:
>
> Hi Naresh,
> Perhaps the below is faster than your approach
>
> library(dplyr)
> library(tidyr)
> merge(option.trades, stock.trades, by="timestamp", all=TRUE) |>
> dplyr::arrange(timestamp) |>
> dplyr::mutate(stock.timestamp =
> as.POSIXct(ifelse(is.na(option.price), timestamp, NA))) |>
> tidyr::fill(stock.price, stock.timestamp) |>
> dplyr::filter(!is.na(option.price)) |>
> dplyr::select(1,2,4,3) ->
> myres2
>
> identical(myres, myres2) ## TRUE
>
>
> On Tue, Aug 8, 2023 at 5:32 AM Naresh Gurbuxani
> <naresh_gurbuxani using hotmail.com> wrote:
>>
>>
>> I have two dataframes, each with a column for timestamp. I want to
>> merge the two dataframes such that each row from first dataframe
>> is matched with the row in the second dataframe with most recent but
>> preceding timestamp. Here is an example.
>>
>> option.trades <- data.frame(timestamp = as.POSIXct(c("2023-08-07 10:23:22", "2023-08-07 10:25:33", "2023-08-07 10:28:41")), option.price = c(2.5, 2.7, 1.8))
>>
>> stock.trades <- data.frame(timestamp = as.POSIXct(c("2023-08-07 10:23:21", "2023-08-07 10:23:34", "2023-08-07 10:24:57", "2023-08-07 10:28:37", "2023-08-07 10:29:01")), stock.price = c(102.2, 102.9, 103.1, 101.8, 101.7))
>>
>> stock.trades <- stock.trades[order(stock.trades$timestamp),]
>>
>> library(plyr)
>> mystock.prices <- ldply(option.trades$timestamp, function(tstamp) tail(subset(stock.trades, timestamp <= tstamp), 1))
>> names(mystock.prices)[1] <- "stock.timestamp"
>> myres <- cbind(option.trades, mystock.prices)
>>
>> This method works. But for large dataframes, it is very slow. Is there
>> a way to speed up the merge?
>>
>> Thanks,
>> Naresh
>>
>> ______________________________________________
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list