[R] Simple clustering help

Uwe Ligges ligges at statistik.tu-dortmund.de
Sun Jul 17 16:43:41 CEST 2011



On 14.07.2011 19:59, donvolencia wrote:
> Hi all,
>
> I have just begun to use R and am hoping to receive some advice about the
> problem I need to solve.  I have a file containing xy points, and I need to
> find all significant clusters and write the xy coordinates of each to a
> file (total points ~ 75000; significant cluster size = 2500 points).  I want
> to use a Euclidean distance threshold to determine whether a point belongs
> to a cluster.
>
> My initial thought is to take a random (seed) point and write a
> region-growing method to determine how many points belong to its cluster
> (basically, add to the cluster all points that are within the threshold and
> continue until no more points are within the threshold).  Once there are no
> neighboring points within the threshold, if the number of points added to
> the region (cluster) is greater than 2500, I would write all of the points'
> coordinates to a text file, remove them from the list of seed candidates,
> and begin again.  If the cluster size is less than 2500, I would simply
> remove the points, as they are not significant.  The process would continue
> until there are fewer than 2500 points remaining.
>
> Is there a package that would be helpful in this task?



That is simple to do in plain R. That said, I guess most of us wonder why
you do not want to use one of the more established clustering methods
that are available in R.
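
As a rough sketch of the plain-R route (an illustration only, not a tested
recipe for the real file): region growing with a distance threshold amounts
to finding the connected components of the graph that links every pair of
points no farther apart than the threshold, so the choice of seed should not
change the resulting clusters. The function name grow_clusters, the variables
eps and min_size, the toy data, and the output file name are all made up for
this example; the values from the post (about 75000 points, cluster size
2500) would replace them.

```r
# Region growing with a Euclidean distance threshold, base R only.
# xy:  two-column numeric matrix of coordinates (hypothetical input)
# eps: distance threshold (hypothetical name)
# Returns an integer cluster label for every row of xy.
grow_clusters <- function(xy, eps) {
  n <- nrow(xy)
  label <- integer(n)            # 0 means "not yet assigned"
  current <- 0L
  for (seed in seq_len(n)) {
    if (label[seed] != 0L) next  # already absorbed by an earlier cluster
    current <- current + 1L
    label[seed] <- current
    frontier <- seed             # points whose neighbours still need checking
    while (length(frontier) > 0) {
      p <- frontier[1]
      frontier <- frontier[-1]
      free <- which(label == 0L)
      if (length(free) == 0) break
      # Squared distances from p to every unassigned point.
      d2 <- (xy[free, 1] - xy[p, 1])^2 + (xy[free, 2] - xy[p, 2])^2
      hit <- free[d2 <= eps^2]
      label[hit] <- current
      frontier <- c(frontier, hit)
    }
  }
  label
}

# Toy data, invented for illustration: two tight blobs and one isolated point.
set.seed(1)
xy <- rbind(
  cbind(rnorm(50, 0, 0.1), rnorm(50, 0, 0.1)),
  cbind(rnorm(30, 5, 0.1), rnorm(30, 5, 0.1)),
  c(20, 20)
)
colnames(xy) <- c("x", "y")

eps <- 1        # assumed threshold for the toy data
min_size <- 25  # stands in for the 2500 mentioned in the post

label <- grow_clusters(xy, eps)
sizes <- table(label)

# Keep only clusters with more than min_size points and write them to a file.
keep <- as.integer(names(sizes)[sizes > min_size])
significant <- cbind(xy, cluster = label)[label %in% keep, , drop = FALSE]
out_file <- file.path(tempdir(), "significant_clusters.txt")
write.table(significant, out_file, row.names = FALSE)

# Cross-check against an established method: single-linkage hierarchical
# clustering cut at height eps should produce the same grouping.
hc <- cutree(hclust(dist(xy), method = "single"), h = eps)
table(label, hc)
```

Each point is expanded once and compared against all still-unassigned
points, so the run time is quadratic in the worst case. The cross-check
relies on the fact that single linkage cut at height eps gives the same
groups as threshold-based region growing. Note that dist() stores all
pairwise distances, which for 75000 points is roughly 2.8e9 values and
likely too much for memory, so that route is shown only on the toy data.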

Uwe Ligges

> Thanks
> Don


