[R-sig-Geo] clustering spatial point data

Rolf Turner r.turner at auckland.ac.nz
Fri Jun 3 06:33:31 CEST 2011


On 03/06/11 04:49, Tal Avgar wrote:
> I am looking for a code/function/algorithm for clustering spatial point data
> into two distinct groups, based on spatial coordinates and a measure of a
> continuous response variable at these locations. The requirement is for
> group members to be as similar as possible in their affiliated response
> values but also group members must be clustered in space so that there are
> no events belonging to one group within the space affiliated with the other.
> Any ideas?
> Thanks,
> Tal.

I haven't yet seen any replies to your post, so I'll chip in with my
two cents (or less!) worth.

I think you need to be more explicit/specific as to how you wish
to form the clusters.  Clustering is usually based on some sort
of distance measure between the points.  Your distance measure
will need to be based both on the spatial distance between the
points and the difference in the values of ``the continuous response
variable''  (such a value is referred to in the trade as a numeric
*mark*) corresponding to a given point.

Once you've defined the distance measure you should be able
to create a ``distance matrix'' and then apply standard clustering
techniques (readily available in R) to that matrix.
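
To make the two steps above concrete, here is a rough sketch (in
Python, purely for illustration) of building a distance matrix that
mixes spatial distance with mark difference, and then clustering on
that matrix.  Everything specific in it is an assumption of mine and
not a recommendation: the weight `alpha`, the rescaling of each
component to [0, 1], the choice of average linkage, the function names
and the toy data are all hypothetical.

```python
import math

def combined_distance_matrix(points, alpha=0.5):
    """points: list of (x, y, z) triples; returns an n x n matrix.

    alpha is a hypothetical weight: 1.0 = purely spatial, 0.0 = marks only.
    """
    n = len(points)
    spatial = [[math.hypot(p[0] - q[0], p[1] - q[1]) for q in points]
               for p in points]
    mark = [[abs(p[2] - q[2]) for q in points] for p in points]
    # Rescale each component to [0, 1] so that neither one dominates
    # merely because of the units it happens to be measured in.
    s_max = max(max(row) for row in spatial) or 1.0
    m_max = max(max(row) for row in mark) or 1.0
    return [[alpha * spatial[i][j] / s_max
             + (1 - alpha) * mark[i][j] / m_max
             for j in range(n)] for i in range(n)]

def average_linkage(dist, k=2):
    """Agglomerative clustering on a distance matrix; returns labels."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Mean pairwise distance between the two clusters.
                d = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d /= len(clusters[a]) * len(clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
    labels = [0] * len(dist)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

# Toy data: two spatially separated groups with different mark levels.
points = [(0.0, 0.0, 1.0), (0.1, 0.2, 1.2), (0.2, 0.1, 0.9),
          (5.0, 5.0, 9.0), (5.1, 4.9, 9.3), (4.8, 5.2, 8.8)]
labels = average_linkage(combined_distance_matrix(points, alpha=0.5), k=2)
print(labels)
```

In R the analogous route would be to convert such a matrix with
as.dist(), pass it to hclust() and cut the tree into two groups with
cutree().  Note that nothing in this sketch enforces the requirement
that the two groups occupy separate regions of space; that would have
to come from how the distance measure is defined.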

One very naive approach would be just to use the Euclidean distance
between the triples (x_i, y_i, z_i) where x_i and y_i specify the point
locations and z_i is the numeric mark of the point in question.

Using this plain Euclidean distance on the raw triples is almost surely
*not* the right thing to do, not least because it mixes the units of the
spatial coordinates with those of the mark.  However, you could try it,
since it would be very easy to implement, and see what it tells you.
You might thereby get some insight into how to define the distance
measure properly.
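
As an illustration of the naive approach, and of one way it can
mislead, the sketch below (again Python, with made-up points and
hypothetical function names) computes the Euclidean distance between
triples and shows that the nearest neighbour of a point can change
when the same locations are merely expressed in different units.

```python
import math

def triple_distance_matrix(points):
    """Naive Euclidean distance between (x, y, z) triples."""
    return [[math.dist(p, q) for q in points] for p in points]

def nearest_neighbour(dist, i):
    """Index of the point closest to point i (excluding i itself)."""
    return min((j for j in range(len(dist)) if j != i),
               key=lambda j: dist[i][j])

# Made-up data: point 1 is close to point 0 in space but has a
# different mark; point 2 is far away in space but has the same mark.
pts_m = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.5), (100.0, 0.0, 1.0)]

# The same locations with coordinates in kilometres instead of metres.
pts_km = [(x / 1000.0, y / 1000.0, z) for x, y, z in pts_m]

nn_m = nearest_neighbour(triple_distance_matrix(pts_m), 0)
nn_km = nearest_neighbour(triple_distance_matrix(pts_km), 0)
print(nn_m, nn_km)
```

With coordinates in metres the spatial term dominates and point 0 is
matched with its spatial neighbour; in kilometres the mark dominates
and it is matched with the distant point that has the same mark.  Any
properly defined distance measure would need to settle explicitly how
the two components are weighed against each other.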

     cheers,

         Rolf Turner


