[R] define number of clusters in kmeans/apcluster analysis

Giorgio Garziano giorgio.garziano at ericsson.com
Sun Dec 13 20:40:10 CET 2015


If you would like to explore a supervised approach instead, I suggest trying
knn() from the class package, fed with a training set that is labelled according to the
cluster assignments you expect. Some "quick code" to show what I mean:

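# x and y are assumed to be the two numeric vectors being clustered;
# standardize both so that neither dominates the distance computation
z <- as.data.frame(cbind(scale(x), scale(y)))
colnames(z) <- c("x", "y")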

n <- nrow(z)
# use the first 30% of the rows as the training set
train <- seq_len(floor(0.3 * n))
ztrain <- z[train, ]

# label the training points according to the expected cluster assignment
# (the thresholds below are only an example)
cl <- ifelse(ztrain$y > 2 | ztrain$x > 0, 2, 1)
plot(ztrain, col=cl)

library(class)
# classify the remaining 70% of the points by their 3 nearest training points
ztest <- z[-train, ]
knn.model <- knn(ztrain, ztest, cl, k = 3)
plot(ztest, col=knn.model)

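ztrain$cl <- cl
# knn() returns a factor; convert it back to the numeric labels so that
# both data frames carry the same column type before rbind()
ztest$cl <- as.numeric(as.character(knn.model))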

z.res <- rbind(ztrain,ztest)
plot(z.res$x, z.res$y, col=z.res$cl)
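
Since x and y are not defined in this message, here is a self-contained variant as a sketch only. The simulated data, the seed, the labelling rule (x > 0) and the candidate values of k are all assumptions made for illustration. It also uses knn.cv() from the class package to compare a few values of k by leave-one-out cross-validation on the training set, instead of fixing k = 3.

```r
library(class)  # provides knn() and knn.cv()

set.seed(42)  # arbitrary seed, only for reproducibility

# Simulated stand-ins for x and y (assumption: two numeric vectors)
x <- c(rnorm(50, mean = -2), rnorm(50, mean = 2))
y <- c(rnorm(50, mean = -2), rnorm(50, mean = 2))
z <- data.frame(x = as.numeric(scale(x)), y = as.numeric(scale(y)))

# Shuffle the rows so that the first 30% covers both groups
z <- z[sample(nrow(z)), ]

n <- nrow(z)
train <- seq_len(floor(0.3 * n))
ztrain <- z[train, ]
ztest <- z[-train, ]

# Hypothetical labelling rule standing in for the expected assignments
cl <- factor(ifelse(ztrain$x > 0, 2, 1))

# Leave-one-out accuracy on the training set for a few values of k
ks <- c(1, 3, 5, 7)
acc <- sapply(ks, function(k) mean(knn.cv(ztrain, cl, k = k) == cl))
names(acc) <- ks
print(acc)

# Classify the remaining points with the best-scoring k
best.k <- ks[which.max(acc)]
pred <- knn(ztrain, ztest, cl, k = best.k)
print(table(pred))
```

Shuffling matters here: taking the first 30% of the rows only gives a sensible training set when the rows are in no particular order.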

--
GG
