[R] Simple clustering help
Uwe Ligges
ligges at statistik.tu-dortmund.de
Sun Jul 17 16:43:41 CEST 2011
On 14.07.2011 19:59, donvolencia wrote:
> Hi all,
>
> I have just begun to use R and am hoping to receive some advice about the
> problem I need to solve. I have a file of xy points in which I need to
> find all significant clusters and write the xy coordinates of each to
> file (total points ~ 75000; sig. cluster = 2500 points). I want to use a
> Euclidean distance threshold to determine whether a point belongs to a cluster.
>
> My initial thought is to take a random (seed) point and write a region
> growing method to determine how many points belong to the cluster
> (basically, add to the cluster all points that are within the threshold and
> continue this until no points are within the threshold). Once there are no
> neighboring points within the threshold, if the number of points added to
> the region (cluster) is greater than 2500, I would write all of the points'
> coordinates to a text file, remove them from the list of seed candidates,
> and begin again. If the cluster size is less than 2500, I would simply
> remove the points, as they are not significant. The process would continue
> until there are fewer than 2500 points remaining.
>
> Is there a package that would be helpful in this task?
You can do that simply in plain R. Anyway, I guess most of us wonder why
you do not want to use one of the more established clustering methods
that are available in R.
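
For illustration only, the region-growing procedure described in the
question could be sketched in plain R roughly as follows. This is a minimal
sketch under stated assumptions, not a tuned implementation: the function
names grow_clusters and keep_significant, the two-column matrix xy, and the
toy data are all hypothetical, and seeds are taken in row order rather than
at random (the resulting groups do not depend on which point in a group is
used as the seed). Each growth step scans all unassigned points, so this may
be slow for ~75000 points.

```r
# Hypothetical helper: label groups of points by region growing.
# Two points are linked if their Euclidean distance is <= thr.
grow_clusters <- function(xy, thr) {
  xy <- as.matrix(xy)
  n <- nrow(xy)
  label <- integer(n)            # 0 means "not yet assigned"
  current <- 0L
  for (seed in seq_len(n)) {
    if (label[seed] != 0L) next  # already absorbed by an earlier region
    current <- current + 1L
    label[seed] <- current
    frontier <- seed             # points whose neighbours are still unchecked
    while (length(frontier) > 0L) {
      p <- frontier[1L]
      frontier <- frontier[-1L]
      free <- which(label == 0L)
      if (length(free) == 0L) break
      # squared distances from p to every unassigned point
      d2 <- (xy[free, 1L] - xy[p, 1L])^2 + (xy[free, 2L] - xy[p, 2L])^2
      hit <- free[d2 <= thr^2]
      label[hit] <- current
      frontier <- c(frontier, hit)
    }
  }
  label
}

# Hypothetical helper: keep only points in clusters larger than min_size.
keep_significant <- function(xy, label, min_size) {
  sizes <- table(label)
  big <- as.integer(names(sizes)[sizes > min_size])
  keep <- label %in% big
  data.frame(x = xy[keep, 1L], y = xy[keep, 2L], cluster = label[keep])
}

# Toy data (made up): one tight blob of 50 points and one of 5 points.
set.seed(1)
xy <- rbind(
  cbind(rnorm(50, 0, 0.1), rnorm(50, 0, 0.1)),
  cbind(rnorm(5, 10, 0.1), rnorm(5, 10, 0.1))
)
lab <- grow_clusters(xy, thr = 1)
sig <- keep_significant(xy, lab, min_size = 10)

# Write the retained coordinates to a text file (temporary file here).
out_file <- tempfile(fileext = ".txt")
write.table(sig, file = out_file, row.names = FALSE, quote = FALSE)
```

For the real data, thr would be the chosen distance threshold and min_size
would be 2500. The groups found this way are the same as those from
single-linkage hierarchical clustering cut at height thr, e.g.
cutree(hclust(dist(xy), method = "single"), h = thr), although dist() on
~75000 points is likely to need too much memory.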
Uwe Ligges
> Thanks
> Don