[R-sig-Geo] complete linkage Agglomerative hierarchical clustering, nnclust, spatclus or something else?
Thomas Lumley
tlumley at u.washington.edu
Wed Apr 21 17:53:21 CEST 2010
You asked earlier about nnclust: it does single-linkage rather than complete-linkage clustering. That is, it defines clusters so that each point in a cluster has a nearest neighbour in the same cluster closer than the threshold distance. This produces clusters that are much less circular (more elongated or chain-like) than those from complete-linkage clustering.
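To make the difference concrete, here is a minimal sketch in plain Python (not nnclust's code, and not R). It treats the threshold version of single linkage as "connected components of the graph joining points closer than the threshold". The function names and the toy data are made up for illustration, and it uses a brute-force loop over all pairs, so it is only meant for small examples.

```python
from itertools import combinations
from math import dist


def single_linkage_clusters(points, threshold):
    """Return clusters (lists of point indices) such that any two points
    closer than `threshold` end up in the same cluster.

    This corresponds to cutting a single-linkage tree at `threshold`: the
    clusters are the connected components of the "closer than threshold" graph.
    """
    parent = list(range(len(points)))  # union-find forest, one node per point

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Brute force over all pairs: fine for a toy example, quadratic in general.
    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) < threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


def diameter(points, members):
    """Largest pairwise distance within one cluster (0.0 for a singleton)."""
    return max(
        (dist(points[i], points[j]) for i, j in combinations(members, 2)),
        default=0.0,
    )


# Hypothetical toy data: six points in a row, one unit apart, plus an outlier.
chain = [(float(x), 0.0) for x in range(6)] + [(20.0, 0.0)]
clusters = single_linkage_clusters(chain, threshold=1.5)
print(clusters)
print(diameter(chain, clusters[0]))
```

The six points in the row are linked neighbour to neighbour into one long cluster whose diameter is well above the threshold. A complete-linkage tree cut at the same height could not return that cluster, because complete linkage bounds the largest within-cluster distance by the cut height.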
The main distinctive feature of nnclust is that it is feasible even for quite large data sets, taking linear space and roughly n log n time.
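As a rough illustration of why threshold clustering does not need the full distance matrix, the sketch below bins 2-D points into a grid with cell side equal to the threshold and only compares points in neighbouring cells. This is a different and simpler technique than whatever nnclust does internally, and I am not claiming it reaches the same n log n bound; it just shows that storage can stay linear in the number of points. The function name and sample points are hypothetical, and it assumes 2-D coordinates and a positive threshold.

```python
from collections import defaultdict
from math import dist, floor


def grid_threshold_clusters(points, threshold):
    """Label 2-D points by connected component of the "closer than threshold"
    graph, without ever building an n-by-n distance matrix.

    Points are hashed into square cells of side `threshold`, so two points
    closer than the threshold always lie in the same or in adjacent cells.
    """
    cells = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        cells[(floor(x / threshold), floor(y / threshold))].append(idx)

    parent = list(range(len(points)))  # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (cx, cy), members in cells.items():
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                # .get() avoids creating empty cells while iterating.
                for j in cells.get((cx + dx, cy + dy), ()):
                    for i in members:
                        # i < j makes sure each pair is tested only once.
                        if i < j and dist(points[i], points[j]) < threshold:
                            parent[find(i)] = find(j)

    # Relabel components 0, 1, 2, ... in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(points))]


# Hypothetical sample: a short chain near the origin and a pair far away.
pts = [(0.0, 0.0), (0.9, 0.0), (1.8, 0.0), (10.0, 10.0), (10.5, 10.2)]
labels = grid_threshold_clusters(pts, threshold=1.0)
print(labels)
```

Memory use is one grid entry and one union-find entry per point. Running time depends on how many points share a cell, so it is close to linear for evenly spread data and degrades when many points fall within one threshold distance of each other.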
-thomas
On Wed, 21 Apr 2010, Hans Ekbrand wrote:
> On Wed, Apr 21, 2010 at 03:14:46PM +0200, Roger Bivand wrote:
>> On Wed, 21 Apr 2010, Hans Ekbrand wrote:
>
> [...]
>
>>> Well, hclust was useful, once I understood how cutree works. What
>>> would be the benefit of dnearneigh(), is it faster?
>>>
>>
>> For larger data sets, hclust needs a triangular distance matrix,
>> dnearneigh does not. Finding graph components in the output "nb" object
>> also seems conceptually more direct.
>
> OK, good to know if I run into trouble when using the code on larger
> data-sets later on.
>
Thomas Lumley Assoc. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle