# [R] finding euclidean proximate points in two datasets

David Winsemius dwinsemius at comcast.net
Thu May 20 16:18:10 CEST 2010

```
On May 20, 2010, at 10:02 AM, Alexander Shenkin wrote:

> Hello all,
>
> I've been poring through the various spatial packages, but haven't come
> across the right thing yet.

There is a SIG for such questions.

>
> Given a set of points in 2-d space X, I'm trying to find the subset of
> points in Y proximate to each point in X.  Furthermore, the proximity
> threshold of each point in X differs (X$threshold).  I've constructed
> this myself already, but it's horrifically slow with a dataset of 40k+
> points in one set, and 700 in the other.
>
> A very inefficient example of what I'm looking for:

Not really a reproducible example. If euclidean_dist is a function,
then it is not one in any of the packages I have installed.

>
>    for (pt in X$idx) {
>        proximity[i] = euclidian_dist(X[pt]$x, X[pt]$y, Y$x, Y$y) < X$threshold
>        i = i+1
>    }
>

Have you considered first creating a subset of candidate points that
are within "threshold" of each reference point on both coordinates?
That might sidestep a lot of calculations on points that are easily
eliminated on a single comparison. Then you could calculate distances
within that surviving subset of points. On average that should give
you an over 50% "hit rate" (in 2-d, the area of a circle relative to
its bounding square):

> pi*0.5^2
[1] 0.7853982

> Perhaps crossdist() in spatstat is what I should use, and then code a
> comparison with X$threshold after the cross-distances are computed.
> However, I was wondering if there was another tool I should be
> considering.  Any and all thoughts are very welcome.  Thanks in
>
> Thanks,
> Allie
> --
> Alexander Shenkin
> PhD Candidate
> School of Natural Resources and Environment
> University of Florida
--
David Winsemius, MD
West Hartford, CT

```
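
The prefilter suggested in the reply might look something like the sketch below. This is only an illustration under assumptions, not code from the thread: the data are simulated, the column names `x`, `y` and `threshold` follow the original post, and `near_one()` is a hypothetical helper name. For each reference point in X it first keeps only the points of Y that fall inside the bounding square of side 2 * threshold, and then computes exact (squared) Euclidean distances for those survivors alone.

```r
# Simulated stand-in data (assumption); the real X and Y were not posted.
set.seed(1)
X <- data.frame(x = runif(50), y = runif(50),
                threshold = runif(50, 0.05, 0.2))
Y <- data.frame(x = runif(2000), y = runif(2000))

# Hypothetical helper: indices of rows of Y within distance `thr`
# of the single reference point (px, py).
near_one <- function(px, py, thr, Y) {
  # Cheap test first: drop points further than thr away on either axis.
  cand <- which(abs(Y$x - px) < thr & abs(Y$y - py) < thr)
  # Exact test only on the survivors; squared distances avoid sqrt().
  d2 <- (Y$x[cand] - px)^2 + (Y$y[cand] - py)^2
  cand[d2 < thr^2]
}

# One integer vector of Y row indices per point in X.
proximity <- lapply(seq_len(nrow(X)), function(i) {
  near_one(X$x[i], X$y[i], X$threshold[i], Y)
})
```

Each element of `proximity` holds the row indices of Y that lie within that X point's own threshold, so `Y[proximity[[1]], ]` gives the neighbours of the first reference point. The loop runs over X, so if the smaller set happens to be Y it may be worth restructuring so that the loop runs over whichever set is smaller and the vectorised comparison runs over the larger one.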