[R] nested for() loops for returning a nearest point
Prof Brian Ripley
ripley at stats.ox.ac.uk
Wed Jul 30 19:10:53 CEST 2003
For largish datasets, knn1 in package class (in the recommended VR bundle)
is probably the quickest way to do this. Something like
knn1(D2[, 1:2], D1[, 1:2], D2$ID)
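A minimal sketch of the knn1 approach, on made-up data. The toy coordinates and IDs are hypothetical; the column names long, lat and point.id follow the original question. knn1() takes the reference ("training") points first, then the query points, then the labels of the reference points. Note that it uses Euclidean distance on the raw coordinates, so on longitude/latitude it only approximates great-circle nearness.

```r
library(class)  # recommended package, provides knn1()

# Hypothetical toy data; column names follow the original question
D1 <- data.frame(long = c(0, 10, 20), lat = c(0, 10, 20), point.id = 1:3)
D2 <- data.frame(long = c(1, 19, 11), lat = c(1, 21, 9),
                 point.id = c("a", "b", "c"), stringsAsFactors = FALSE)

# knn1(train, test, cl): D2 is the reference set, D1 holds the query points,
# and cl gives the label (here the point ID) of each reference point
nn <- knn1(D2[, c("long", "lat")], D1[, c("long", "lat")], factor(D2$point.id))

# Store the ID of the nearest D2 point for each row of D1
D1$neighbor.id <- as.character(nn)
D1
```

For data spanning large areas or high latitudes, a great-circle distance may be the safer choice.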
On Wed, 30 Jul 2003, Roger Bivand wrote:
> On Wed, 30 Jul 2003, Steve Sullivan wrote:
>
> > I'm trying to do the following:
> >
> >
> >
> > For each ordered pair of a data frame (D1) containing longitudes and
> > latitudes and unique point IDs, calculate the distance to every point in
> > another data frame (D2) also containing longitudes, latitudes and point
> > IDs, and return to a new variable in D1 the point ID of the nearest
> > element of D2.
>
> I think you can get quite a long way with the function rdist.earth() in
> the fields package:
>
> > loc1 <- expand.grid(long=seq(-150,150,5), lat=seq(-70,70,5))
> > dim(loc1)
> [1] 1769 2
> > loc2 <- expand.grid(long=seq(-150,150,7.5), lat=seq(-70,70,7.5))
> > dim(loc2)
> [1] 779 2
> > dists <- rdist.earth(loc1, loc2)
> > id12 <- apply(dists, 1, which.min)
> > length(id12)
> [1] 1769
> > id21 <- apply(dists, 2, which.min)
> > length(id21)
> [1] 779
>
> using id12 and id21 to choose the point.ids if need be
>
> > loc2$point.id[id12]
>
> Roger
>
> >
> > Dramatis personae (mostly self-explanatory):
> >
> > D1$long
> > D1$lat
> > D1$point.id
> > neighbor.id (to be created; for each ordered pair in D1, the point ID of
> > the nearest ordered pair in D2)
> > D2$long
> > D2$lat
> > D2$point.id
> > dist.geo (to be created)
> >
> > I've been attempting this with nested for() loops that step through each
> > ordered pair in D1, and for each ordered pair [i] in D1 create a vector
> > (dist.geo) the length of D2$lat (say) that contains the distance
> > calculated from every ordered pair in D2 to the current ordered pair [i]
> > of D1, assign a value for D1$neighbor.id[i] based on
> > D2$point.id[which.min(dist.geo)], and move on to the next ordered pair
> > of D1 to create another dist.geo, assign another neighbor.id, etc.
> >
> >
> >
> > There are no missings/NAs in any of the longs, lats or point.ids,
> > although advice on generalizing this to deal with them would be
> > appreciated.
> >
> >
> >
> > What I've been trying:
> >
> >
> >
> > neighbor.id <- vector(length=length(D1$lat))
> > dist.geo <- vector(length=length(D2$lat))
> > for(i in 1:length(neighbor.id)){
> >   for(j in 1:length(dist.geo)){
> >     dist.geo[j] <- D1$lat[i]-D2$lat[j]
> >   }
> >   # Yes, I know that isn't the right formula, this is just a test
> >   neighbor.id[i] <- D2$point.id[which.min(dist.geo)]
> > }
> >
> >
> >
> > What I get is a neighbor.id of the appropriate length, but which
> > consists only of the same value repeated. Should I instead pass the
> > which.min(dist.geo) to a variable before exiting the inner (j) loop, and
> > reference that variable in place of which.min(dist.geo) in the last
> > line? Or is this whole approach wrongheaded?
> >
> >
> >
> > This should be elementary, I know, so I appreciate everyone's
> > forbearance.
> >
> >
> >
> > Steven Sullivan, Ph.D.
> > Senior Associate
> > The QED Group, LLC
> > 1250 Eye St. NW, Suite 802
> > Washington, DC 20005
> > ssullivan at qedgroupllc.com
> > 202.898.1910.x15 (v)
> > 202.898.0887 (f)
> > 202.421.8161 (m)
>
>
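On the repeated value in the quoted loop: a likely cause is that the test formula uses a signed difference. D1$lat[i] - D2$lat[j] is smallest for whichever D2 point has the largest latitude, whatever i is, so which.min() returns the same index every time. The sketch below, on hypothetical toy data, contrasts the signed difference with the absolute difference; it is an illustration only, not a proper geographic distance.

```r
# Hypothetical toy data: latitudes only, as in the quoted test loop
D1 <- data.frame(lat = c(10, 35, 60), point.id = 1:3)
D2 <- data.frame(lat = c(58, 12, 33), point.id = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

# Signed difference, as in the quoted loop: the minimum is always at the
# D2 row with the largest latitude, so every D1 row gets the same ID
signed <- sapply(D1$lat, function(x) D2$point.id[which.min(x - D2$lat)])

# Absolute difference: the minimum is at the D2 row closest to each D1 row
fixed <- sapply(D1$lat, function(x) D2$point.id[which.min(abs(x - D2$lat))])

signed
fixed
```

As for missing values, which.min() skips NA entries in its argument, so missing D2 coordinates are simply ignored. A missing D1 coordinate makes every distance NA and yields a zero-length result, so those rows need separate handling.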
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595