[R] Find the ideal cluster
Jovani T. de Souza
jov@n|@ouz@5 @end|ng |rom gm@||@com
Sat Dec 12 16:27:18 CET 2020
Some colleagues and I developed a hierarchical clustering algorithm in R
to find the main clusters of agricultural industries for a particular
city (e.g. London). It works as intended. With the filters we built into
the algorithm, we were able to generate 6 clustering scenarios for
London. For example, the first scenario generated 2 clusters, the second
scenario 5 clusters, and so on. I would like some help choosing the most
appropriate scenario. I saw that some packages help with this, such as
`pvclust`, but I couldn't get it to work for my case. I am including a
short executable example below to show the essence of what I want.
Any help is welcome! If you know how to do this using another package,
feel free to describe it.
Best regards.
library(rdist)
library(geosphere)
library(fpc)
df <- structure(list(Industries = c(1, 2, 3, 4, 5, 6),
                     Latitude = c(-23.8, -23.8, -23.9, -23.7, -23.7, -23.7),
                     Longitude = c(-49.5, -49.6, -49.7, -49.8, -49.6, -49.9),
                     Waste = c(526, 350, 526, 469, 534, 346)),
                class = "data.frame", row.names = c(NA, -6L))
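df1 <- df  # copy kept for the second scenario (k = 5)
# clusters
coordinates <- df[c("Latitude", "Longitude")]
# distm() expects (longitude, latitude) order and returns distances in metres
d <- as.dist(distm(coordinates[, 2:1]))
fit.average <- hclust(d, method = "average")
# first scenario: cut the tree into 2 clusters
clusters <- cutree(fit.average, k = 2)
df$cluster <- clusters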
> df
  Industries Latitude Longitude Waste cluster
1          1    -23.8     -49.5   526       1
2          2    -23.8     -49.6   350       1
3          3    -23.9     -49.7   526       1
4          4    -23.7     -49.8   469       2
5          5    -23.7     -49.6   534       1
6          6    -23.7     -49.9   346       2
clusters1<-cutree(fit.average, k=5)
df1$cluster <- clusters1
> df1
  Industries Latitude Longitude Waste cluster
1          1    -23.8     -49.5   526       1
2          2    -23.8     -49.6   350       1
3          3    -23.9     -49.7   526       2
4          4    -23.7     -49.8   469       3
5          5    -23.7     -49.6   534       4
6          6    -23.7     -49.9   346       5
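For reference, one common way to compare cuts of the same tree is the average silhouette width, where higher values suggest better separated clusters. The following is only a minimal sketch under assumptions that are not part of the original code: it uses `silhouette()` from the `cluster` package, it treats k = 2 to 5 as the candidate scenarios (the actual 6 scenarios come from filters not shown here), and the names `candidate_k`, `avg_sil` and `best_k` are purely illustrative. It ignores the `Waste` column and judges the cuts on geographic distance alone.

```r
library(geosphere)  # distm(): geographic distances in metres
library(cluster)    # silhouette(): assumed here, not used in the original post

df <- data.frame(
  Industries = 1:6,
  Latitude   = c(-23.8, -23.8, -23.9, -23.7, -23.7, -23.7),
  Longitude  = c(-49.5, -49.6, -49.7, -49.8, -49.6, -49.9)
)

# Same distance matrix and tree as in the example above
d <- as.dist(distm(df[, c("Longitude", "Latitude")]))
fit.average <- hclust(d, method = "average")

# Hypothetical candidate scenarios; silhouette() needs 2 <= k <= n - 1
candidate_k <- 2:5

# Average silhouette width for each cut of the tree
avg_sil <- sapply(candidate_k, function(k) {
  cl <- cutree(fit.average, k = k)
  mean(silhouette(cl, d)[, "sil_width"])
})
names(avg_sil) <- candidate_k
print(round(avg_sil, 3))

# Candidate with the largest average silhouette width
best_k <- candidate_k[which.max(avg_sil)]
print(best_k)
```

With only 6 points the silhouette values are rough, and singleton clusters contribute a width of 0, so the result should be read as a hint rather than a definitive answer.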