[R] Find the ideal cluster

Michael Dewey
Sat Dec 12 18:06:33 CET 2020


Dear Jovani

If you cross-post on CrossValidated as well as here it is polite to give 
a link so people do not answer here when someone has already answered 
there, or vice versa.

Michael

On 12/12/2020 15:27, Jovani T. de Souza wrote:
> Some colleagues and I developed a hierarchical clustering
> algorithm to find the main clusters of agricultural
> industries in a particular city (e.g. London). We
> implemented this algorithm in R. It is working perfectly. Using
> the filters that we built into the algorithm, we were able to generate 6
> clustering scenarios for London. For example, the first scenario
> generated 2 clusters, the second scenario 5 clusters, and so on. I would
> therefore like some help on how to choose the most appropriate one. I
> saw that there are some packages that help in this process, like `pvclust`,
> but I couldn't use it for my case. I am inserting a brief executable code
> example below to show the essence of what I want.
> 
> Any help is welcome! If you know how to do this using another package, feel
> free to describe it.
> 
> Best Regards.
> 
> 
>      library(rdist)
>      library(geosphere)
>      library(fpc)
> 
> 
>      df <- structure(list(Industries = c(1, 2, 3, 4, 5, 6),
>                           Latitude = c(-23.8, -23.8, -23.9, -23.7, -23.7, -23.7),
>                           Longitude = c(-49.5, -49.6, -49.7, -49.8, -49.6, -49.9),
>                           Waste = c(526, 350, 526, 469, 534, 346)),
>                      class = "data.frame", row.names = c(NA, -6L))
> 
>      df1<-df
> 
>      #clusters
>      coordinates<-df[c("Latitude","Longitude")]
>      d<-as.dist(distm(coordinates[,2:1]))
>      fit.average<-hclust(d,method="average")
> 
>      clusters<-cutree(fit.average, k=2)
>      df$cluster <- clusters
>      > df
>        Industries Latitude Longitude Waste cluster
>      1          1    -23.8     -49.5   526       1
>      2          2    -23.8     -49.6   350       1
>      3          3    -23.9     -49.7   526       1
>      4          4    -23.7     -49.8   469       2
>      5          5    -23.7     -49.6   534       1
>      6          6    -23.7     -49.9   346       2
>      >
>      clusters1<-cutree(fit.average, k=5)
>      df1$cluster <- clusters1
>      > df1
>        Industries Latitude Longitude Waste cluster
>      1          1    -23.8     -49.5   526       1
>      2          2    -23.8     -49.6   350       1
>      3          3    -23.9     -49.7   526       2
>      4          4    -23.7     -49.8   469       3
>      5          5    -23.7     -49.6   534       4
>      6          6    -23.7     -49.9   346       5
>      >
> 
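
One common way to compare cuts of the same tree is the average silhouette
width, which can be computed with silhouette() from the cluster package.
The following is only a sketch under assumptions: the candidate values in
candidate_k are hypothetical (the post names only 2 and 5 of the six
scenarios), and the silhouette is just one of several possible criteria,
not necessarily the right one for this problem.

```r
library(geosphere)  # distm(), as used in the code quoted above
library(cluster)    # silhouette()

df <- data.frame(
  Industries = c(1, 2, 3, 4, 5, 6),
  Latitude   = c(-23.8, -23.8, -23.9, -23.7, -23.7, -23.7),
  Longitude  = c(-49.5, -49.6, -49.7, -49.8, -49.6, -49.9),
  Waste      = c(526, 350, 526, 469, 534, 346)
)

# Same distance matrix and tree as in the quoted code (longitude first)
d <- as.dist(distm(df[, c("Longitude", "Latitude")]))
fit.average <- hclust(d, method = "average")

# ASSUMPTION: hypothetical candidate cluster counts; replace them with the
# cluster counts of the six scenarios actually generated
candidate_k <- 2:5

# Average silhouette width for each candidate cut of the tree
avg_sil <- sapply(candidate_k, function(k) {
  cl <- cutree(fit.average, k = k)
  mean(silhouette(cl, d)[, "sil_width"])
})
names(avg_sil) <- candidate_k
print(round(avg_sil, 3))

# Candidate with the largest average silhouette width
best_k <- candidate_k[which.max(avg_sil)]
print(best_k)
```

A larger average silhouette width suggests tighter, better separated
clusters. With only six points the values are rough, and singleton clusters
contribute a width of 0, so the result is a guide rather than a decision.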

-- 
Michael
http://www.dewey.myzen.co.uk/home.html


