[R] merging and obtaining the nearest value

Francesco cariboupad at gmx.fr
Sun Aug 19 13:00:51 CEST 2012


Dear Rui, many thanks for your suggestion.

However, these are just simplified examples. In reality, dataset A
contains millions of observations and B has several thousand rows.
Could I still use a modified form of your suggestion?

Thanks
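
For data of that size, merging every DATE with every Special_Date of the
same TYPE may create far too many rows. One possible alternative, sketched
here under assumptions and not taken from the thread, is to sort the special
dates of each type once and look up the neighbours of each DATE with
findInterval(). The helper name nearest_special() is hypothetical, and the
sketch assumes the dates are plain numbers as in the example.

```r
# Hypothetical helper: for each value in `dates`, return the closest value
# in `special`; on a tie, the later special date is chosen.
nearest_special <- function(dates, special) {
  if (length(special) == 0L) return(rep(NA_real_, length(dates)))
  s  <- sort(unique(special))
  i  <- findInterval(dates, s)         # index of largest s <= date (0 if none)
  lo <- s[pmax(i, 1L)]                 # candidate at or below the date
  hi <- s[pmin(i + 1L, length(s))]     # candidate above the date (or the last one)
  # use hi when nothing lies below, or when hi is at least as close as lo
  ifelse(i == 0L | abs(hi - dates) <= abs(dates - lo), hi, lo)
}

# Same example data as in the thread
A <- data.frame(TYPE = c("A", "A", "A", "B", "B"),
                DATE = c(2, 5, 20, 10, 2))
B <- data.frame(TYPE = c("A", "A", "A", "A", "B", "B"),
                Special_Date = c(2, 6, 20, 22, 5, 6))

# One vector of special dates per type, then one lookup per type
special_by_type <- split(B$Special_Date, B$TYPE)
A$Difference <- NA_real_
for (tp in unique(as.character(A$TYPE))) {
  rows <- A$TYPE == tp
  A$Difference[rows] <- A$DATE[rows] -
    nearest_special(A$DATE[rows], special_by_type[[tp]])
}
A
```

Rows of A, including duplicates, stay in their original order, and no
intermediate table of all DATE/Special_Date pairs is built.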

On 19 August 2012 12:51, Rui Barradas <ruipbarradas at sapo.pt> wrote:
> Hello,
>
> Try the following.
>
>
> A <- read.table(text="
>
> TYPE   DATE
> A            2
> A            5
> A            20
> B            10
> B            2
> ", header = TRUE)
>
>
> B <- read.table(text="
>
> TYPE  Special_Date
> A              2
> A              6
> A              20
> A              22
> B              5
> B              6
> ", header = TRUE)
>
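> m <- merge(A, B)  # join on TYPE: every DATE paired with every Special_Date of that type
> result <- do.call( rbind, lapply(split(m, list(m$DATE, m$TYPE)),
> function(x){
>         a <- abs(x$DATE - x$Special_Date)
>         # keep the closest special date(s); ties keep all rows, empty groups are dropped
>         if(nrow(x)) x[which(min(a) == a), ] }) )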
> result$Difference <- result$DATE - result$Special_Date
> result$Special_Date <- NULL
> rownames(result) <- seq_len(nrow(result))
> result
>
>
> Also, it's good practice to post data examples using dput(). For instance,
>
> dput(A)
> structure(list(TYPE = structure(c(1L, 1L, 1L, 2L, 2L), .Label = c("A",
> "B"), class = "factor"), DATE = c(2L, 5L, 20L, 10L, 2L)), .Names = c("TYPE",
> "DATE"), class = "data.frame", row.names = c(NA, -5L))
>
> Now all we have to do is run the statement A <- structure(... etc...) to
> have an exact copy of the data example.
> Anyway, your example with input and the wanted result was very welcome.
>
> Hope this helps,
>
> Rui Barradas
>
> On 19-08-2012 11:10, Francesco wrote:
>>
>> Dear R-help
>>
>> I would like to know if there is a short solution in R for this
>> merging problem...
>>
>> Let's say I have a dataset A as:
>>
>> TYPE   DATE
>> A            2
>> A            5
>> A            20
>> B            10
>> B            2
>>
>> (there can be duplicates for the same type and date)
>>
>> and I have another dataset B as :
>>
>> TYPE  Special_Date
>> A              2
>> A              6
>> A              20
>> A              22
>> B              5
>> B              6
>>
>> The question is: for each observation in A, I would like to obtain the
>> difference between its date and the closest special date in B with
>> the same type. In case of ties I would take the later of the two
>> dates.
>>
>> For example I would obtain here
>>
>> TYPE   DATE   Difference
>> A      2       0  (= 2 - 2)
>> A      5      -1  (= 5 - 6)
>> A      20      0  (= 20 - 20)
>> B      10     +4  (= 10 - 6)
>> B      2      -3  (= 2 - 5)
>>
>> Do you know how to (simply?) obtain this in R?
>>
>> Many thanks!
>> Best Regards
>>
>
>



