[R] Transport and Earth Mover's Distance
Schuhmacher, Dominic
dominic.schuhmacher at mathematik.uni-goettingen.de
Thu Mar 9 17:44:27 CET 2017
> On 08.03.2017 at 11:28, Schuhmacher, Dominic <dominic.schuhmacher at mathematik.uni-goettingen.de> wrote:
>
> ...
>>>
>>> If you have no particular need for binning, check out the function
>>> pppdist in the R-package spatstat, which offers a more flexible way
>>> to deal with point patterns of different size.
>>
>>
>> Well, this is not clear, but possibly very important for me.
>> My raw data consists of 2 univariate samples of unequal length.
>>
>> suppose that
>>
>> x<-rnorm(100)
>>
>> and
>>
>> y<-rnorm(90)
>>
>> Is there a way to define the Wasserstein distance between them without
>> going through the binning procedure?
>>
> Define, yes: the 1-Wasserstein distance in one dimension is the area between the empirical cumulative distribution functions. If the samples had the same length, this could be computed directly by
>
> mean(abs(sort(x)-sort(y)))
>
> Otherwise this needs some lines of code. I will include it in the next version of the transport package (soon).
>
> Best regards,
> Dominic
>
>
Following up on this earlier post: transport 0.8-2, which is now on CRAN, can compute the Wasserstein distance between univariate samples of differing lengths (more precisely, between their empirical distributions).

library(transport)
x <- rnorm(100)
y <- rnorm(90)
wasserstein1d(x, y)
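
For anyone who wants to see the "area between the empirical cumulative distribution functions" spelled out, here is a minimal base-R sketch. The helper name w1_ecdf is made up for illustration and is not part of the transport package; it is not necessarily how wasserstein1d works internally. It only assumes that both ECDFs are step functions that are constant between consecutive pooled sample points, so the integral of their absolute difference reduces to a finite sum.

```r
# Hypothetical helper (NOT part of the transport package): 1-Wasserstein
# distance between the empirical distributions of x and y, computed as the
# area between the two empirical cumulative distribution functions.
w1_ecdf <- function(x, y) {
  Fx <- ecdf(x)
  Fy <- ecdf(y)
  z <- sort(unique(c(x, y)))      # every jump point of either ECDF
  if (length(z) < 2) return(0)    # all mass at a single point
  left <- z[-length(z)]           # left endpoints of the intervals
  # Both ECDFs are constant on [z[i], z[i+1]), so the area is a finite sum
  # of (height difference) * (interval width).
  sum(abs(Fx(left) - Fy(left)) * diff(z))
}

set.seed(1)
x <- rnorm(100)
y <- rnorm(90)
w1_ecdf(x, y)                     # samples of unequal length

# Sanity check for equal lengths against the sorted-differences formula
y2 <- rnorm(100)
all.equal(w1_ecdf(x, y2), mean(abs(sort(x) - sort(y2))))
```

The last line compares the sketch with mean(abs(sort(x)-sort(y))) from the earlier post, which only applies when both samples have the same length.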

Cheers,
Dominic
------------------------------------
Dominic Schuhmacher
Professor of Stochastics
University of Goettingen
http://www.dominic.schuhmacher.name