[R] merging and obtaining the nearest value
Francesco
cariboupad at gmx.fr
Sun Aug 19 13:00:51 CEST 2012
Dear Rui, many thanks for your suggestion.
However, these are just simplified examples. In reality, dataset A
contains millions of observations and B several thousand rows.
Could I still use a modified form of your suggestion?
Thanks
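
For data of that size, merging A and B in full within each TYPE may
produce more rows than fit comfortably in memory. A minimal sketch of one
alternative, assuming numeric dates, looks up each DATE in the sorted
special dates of its type with base R's findInterval(). The helper name
nearest_diff is hypothetical, and this is not the approach from the quoted
reply, only an illustration that avoids building every DATE/Special_Date
pair.

```r
# Hypothetical helper: difference between each DATE in A and the nearest
# Special_Date in B of the same TYPE; ties go to the later special date.
nearest_diff <- function(A, B) {
  diff <- rep(NA_real_, nrow(A))          # NA where a TYPE is absent from B
  b_by_type <- split(B$Special_Date, as.character(B$TYPE))
  a_rows <- split(seq_len(nrow(A)), as.character(A$TYPE))
  for (type in names(a_rows)) {
    s <- b_by_type[[type]]
    if (is.null(s)) next
    s <- sort(unique(s))
    rows <- a_rows[[type]]
    d <- A$DATE[rows]
    lo <- findInterval(d, s)              # largest s <= d (0 if none)
    hi <- pmin(lo + 1L, length(s))        # next special date above d
    lo <- pmax(lo, 1L)
    use_hi <- abs(s[hi] - d) <= abs(d - s[lo])   # "<=" sends ties to the later date
    diff[rows] <- d - ifelse(use_hi, s[hi], s[lo])
  }
  cbind(A, Difference = diff)
}

A <- data.frame(TYPE = c("A", "A", "A", "B", "B"), DATE = c(2, 5, 20, 10, 2))
B <- data.frame(TYPE = c("A", "A", "A", "A", "B", "B"),
                Special_Date = c(2, 6, 20, 22, 5, 6))
nearest_diff(A, B)   # Difference column: 0 -1 0 4 -3
```

Rows of A keep their original order and duplicates are preserved, because
each row is looked up independently rather than grouped.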
On 19 August 2012 12:51, Rui Barradas <ruipbarradas at sapo.pt> wrote:
> Hello,
>
> Try the following.
>
>
> A <- read.table(text="
>
> TYPE DATE
> A 2
> A 5
> A 20
> B 10
> B 2
> ", header = TRUE)
>
>
> B <- read.table(text="
>
> TYPE Special_Date
> A 2
> A 6
> A 20
> A 22
> B 5
> B 6
> ", header = TRUE)
>
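> m <- merge(A, B)  # all DATE/Special_Date pairs within each TYPE
> result <- do.call( rbind, lapply(split(m, list(m$DATE, m$TYPE)),
> function(x){
> a <- abs(x$DATE - x$Special_Date)
> if(nrow(x)) {
> near <- x[a == min(a), ]  # nearest special date(s)
> near[near$Special_Date == max(near$Special_Date), ] } }) )  # ties: keep the later date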
> result$Difference <- result$DATE - result$Special_Date
> result$Special_Date <- NULL
> rownames(result) <- seq_len(nrow(result))
> result
>
>
> Also, it's good practice to post data examples using dput(). For instance,
>
> dput(A)
> structure(list(TYPE = structure(c(1L, 1L, 1L, 2L, 2L), .Label = c("A",
> "B"), class = "factor"), DATE = c(2L, 5L, 20L, 10L, 2L)), .Names = c("TYPE",
> "DATE"), class = "data.frame", row.names = c(NA, -5L))
>
> Now all we have to do is run the statement A <- structure(... etc...) to
> have an exact copy of the data example.
> Anyway, your example with input and the wanted result was very welcome.
>
> Hope this helps,
>
> Rui Barradas
>
> Em 19-08-2012 11:10, Francesco escreveu:
>>
>> Dear R-help
>>
>> I would like to know if there is a short solution in R for this
>> merging problem...
>>
>> Let's say I have a dataset A as:
>>
>> TYPE DATE
>> A 2
>> A 5
>> A 20
>> B 10
>> B 2
>>
>> (there can be duplicates for the same type and date)
>>
>> and I have another dataset B as:
>>
>> TYPE Special_Date
>> A 2
>> A 6
>> A 20
>> A 22
>> B 5
>> B 6
>>
>> The question is: I would like to obtain the difference between the
>> date of each observation in A and the closest special date in B with
>> the same type. In case of ties I would take the latest date of the
>> two.
>>
>> For example I would obtain here
>>
>> TYPE DATE Difference
>> A 2 0=2-2
>> A 5 -1=5-6
>> A 20 0=20-20
>> B 10 +4=10-6
>> B 2 -3=2-5
>>
>> Do you know how to (simply?) obtain this in R?
>>
>> Many thanks!
>> Best Regards
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>