[R] Plots with k-means

David Winsemius dwinsemius at comcast.net
Mon Nov 2 22:35:28 CET 2009


The attached file did not come through to the list. I think you have  
some non-standard characters (or at least non-standard in my locale).  
I was able to get the code to run after using the Zap Gremlins  
function in TextWrangler. Prior to that "treatment" pretty much every  
line threw an error of this sort:

 > setClass(Class = 'POI',
+        representation(matrizSim = 'matrix',cos.query.docs = 'vector',
Error: unexpected input in:
"setClass(Class = 'POI',
¬"
 >      wordsInQuery = 'ANY',docs = 'matrix', objeto = 'matrix', objetoC
Error: unexpected input in "¬"
 > = 'matrix',
Error: unexpected '=' in "="
 >      Pcoords = 'matrix', PcoordsFI = 'matrix', newPcoords = 'matrix',
Error: unexpected input in "¬"
 > newcoords = 'numeric' ,
Error: unexpected ',' in "newcoords = 'numeric' ,"
 >      newcoords_1 = 'numeric',  M = 'numeric', poisTextCol =
Error: unexpected input in "¬"

I also needed to remove a couple of spaces between function names and  
parentheses when these occurred at ends-of-lines. Attached is a  
working version as a .txt file (which should make it through the list- 
serv:

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: kmeans.mapper.txt
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20091102/4637ca2c/attachment-0002.txt>
-------------- next part --------------



--  
David.
 > sessionInfo()
R version 2.10.0 Patched (2009-10-29 r50258)
x86_64-apple-darwin9.8.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets   
methods   base

other attached packages:
[1] rms_2.1-0       Hmisc_3.7-0     survival_2.35-7

loaded via a namespace (and not attached):
[1] cluster_1.12.1  grid_2.10.0     lattice_0.17-26




On Nov 2, 2009, at 3:43 PM, eduardo san miguel wrote:

> I send r-code in an attached file.
>
> 2009/11/2 Iuri Gavronski <iuri at proxima.adm.br>:
>> Eduardo,
>>
>> Would you mind sending me the R code in an attached file. Your code  
>> didn't
>> work here and I am not sure it is because of line breaks from the  
>> email
>> program.
>>
>> Iuri.
>>
>> On Mon, Nov 2, 2009 at 10:53 AM, eduardo san miguel <
> eduardosanmi at gmail.com>
>> wrote:
>>>
>>> Hello all,
>>>
>>> I have almost finished the development of a new package where ideas
>>> from Tamara Munzner, George Furnas and Costa and Venturini are
>>> implemented.
>>>
>>> 1.- Da Costa, David & Venturini, Gilles (2006). An Interactive
>>> Visualization Environment for Data Exploration Using Points of
>>> Interest. adma 2006: 416-423
>>>
>>> 2.- Furnas, George (1986). Generalized Fisheye Views. Human  
>>> Factors in
>>> computing systems, CHI '86 conference proceedings, ACM, New York,  
>>> pp.
>>> 16-23.
>>>
>>> 3.- Heidi Lam, Ronald A. Rensink, and Tamara Munzner (2006). Effects
>>> of 2D Geometric Transformations on Visual Memory. Proc. Applied
>>> Perception in Graphics and Visualization (APGV 2006), 119-126, 2006.
>>>
>>> 4.- Keith Lau, Ron Rensink, and Tamara Munzner (2004). Perceptual
>>> Invariance of Nonlinear Focus+Context Transformations. Proc. First
>>> Symposium on Applied Perception in Graphics and Visualization (APGV
>>> 04) 2004, pp 65-72.
>>>
>>> This is a sample with some basic functionality and a VERY BASIC
>>> example with kmeans plotting.
>>>
>>> Comments will be greatly appreciated.
>>>
>>> Regards
>>>
>>> -- R CODE
>>> require(methods)
>>>
>>> setClass(Class = 'POI',
>>>       representation(matrizSim = 'matrix',cos.query.docs = 'vector',
>>>     wordsInQuery = 'ANY',docs = 'matrix', objeto = 'matrix', objetoC
>>> = 'matrix',
>>>     Pcoords = 'matrix', PcoordsFI = 'matrix', newPcoords = 'matrix',
>>> newcoords = 'numeric' ,
>>>     newcoords_1 = 'numeric',  M = 'numeric', poisTextCol =
>>> 'character' , colores = 'vector' ,
>>>     poisCircleCol = 'character' , linesCol = 'character', itemsCol =
>>> 'character',
>>>     LABELS =  'logical',  vscale = 'numeric',  hscale = 'numeric',
>>> circleCol = 'character',
>>>     plotCol = 'character',  itemsFamily = 'character',  lenteDefault
>>> = 'numeric',
>>>     zoomDefault = 'numeric' ,  rateDefault = 'numeric' ,
>>> topKDefault = 'numeric'  ,
>>>     pal = 'character',  selected = 'numeric' ,  circRadio =
>>> 'numeric' , IncVscale = 'numeric',
>>>     cgnsphrFont = 'numeric', xClick_old = 'numeric',  yClick_old =
>>> 'numeric',
>>>     wordsInQueryFull = 'character' ),
>>>     prototype(cos.query.docs = 0, colores = 0, newcoords = 0,
>>> newcoords_1 = 0, M = 3,
>>>              vscale = 0.5 , hscale = 1.5 , circleCol = 'black' ,
>>> itemsCol = 'white',
>>>              poisTextCol =  '#fff5ee',  poisCircleCol = '#fff5ee',
>>> linesCol = 'white',
>>>              plotCol = 'black', itemsFamily = 'sans', lenteDefault =
>>> 1, zoomDefault = 15 ,
>>>              rateDefault = 0.1 , topKDefault = 25,  pal = 'topo' ,
>>> selected = 1 ,
>>>              circRadio = 0.25  , IncVscale = 0.05  ,  cgnsphrFont =
>>> 1.01, LABELS = T)
>>> )
>>>
>>> setGeneric("puntosMedios" ,
>>>            function(Pcoords, detalle = 5){standardGeneric
>>> ("puntosMedios")})
>>>
>>> setMethod("puntosMedios" ,
>>>           signature = "matrix",
>>>           function(Pcoords, detalle = 5){
>>>
>>> for (i in 1:detalle){
>>>   new_pcoords = matrix(rep(0,4*nrow(Pcoords)), nrow = 2*
>>> nrow(Pcoords), byrow = T )
>>>   cont = 0
>>>   for (i in 1:nrow(Pcoords)){
>>>          if (i == nrow(Pcoords)) {
>>>               cont = cont + 1
>>>               new_pcoords[cont,] = Pcoords[i,]
>>>               cont = cont + 1
>>>               new_pcoords[cont,] = Pcoords[i,] -
>>> ((Pcoords[i,]-Pcoords[1,])/2)
>>>       }else{
>>>               cont = cont + 1
>>>               new_pcoords[cont,] = Pcoords[i,]
>>>               cont = cont + 1
>>>               new_pcoords[cont,] = Pcoords[i,] -
>>> ((Pcoords[i,]-Pcoords[i+1,])/2)}}
>>>   Pcoords = new_pcoords}
>>>   return(Pcoords)
>>>
>>>  }
>>> )
>>>
>>> setGeneric("fishIout" ,
>>>            function(x, value){standardGeneric ("fishIout")})
>>>
>>> setMethod("fishIout" ,
>>>           signature = "numeric",
>>>           function(x, value){
>>>
>>> d = value
>>>       if (x > 0){
>>>               signo = 1
>>>       }else{
>>>               signo = -1
>>>       }
>>>       x = abs(x)
>>>       return(signo*(-(x/((d*x)-d-1))))
>>>  }
>>> )
>>>
>>> setGeneric("fishIin" ,
>>>            function(x, value){standardGeneric ("fishIin")})
>>>
>>> setMethod("fishIin" ,
>>>           signature = "numeric",
>>>           function(x, value){
>>>
>>> d = value
>>>       if (x > 0){
>>>               signo = 1
>>>       }else{
>>>               signo = -1
>>>       }
>>>       x = abs(x)
>>>
>>>       return(signo*(((d+1)*x)/(d*x+1)))
>>>  }
>>> )
>>>
>>> setGeneric("toPolar" ,
>>>            function(x, y){standardGeneric ("toPolar")})
>>>
>>> setMethod("toPolar" ,
>>>           signature = "numeric",
>>>           function(x, y){
>>>
>>>       t1 = atan2(y,x)
>>>       rP = sqrt(x^2+y^2)
>>>       return(c(t1 = t1,rP = rP))
>>>
>>>  }
>>> )
>>>
>>> setGeneric("toCartesian" ,
>>>            function(t1, rP){standardGeneric ("toCartesian")})
>>>
>>> setMethod("toCartesian" ,
>>>           signature = "numeric",
>>>           function(t1, rP){
>>>
>>>       x1 = rP*cos(t1)
>>>       y1 = rP*sin(t1)
>>>       return(c(x = x1,y = y1))
>>>
>>>  }
>>> )
>>>
>>> setGeneric("circulo" ,
>>>            function(cx, cy, r, circleCol, PLOT =
>>> TRUE){standardGeneric ("circulo")})
>>>
>>> setMethod("circulo" ,
>>>           signature = "numeric",
>>>           function(cx, cy, r, circleCol, PLOT = TRUE){
>>>
>>>       t = seq(0,2*pi,length=100)
>>>       circle = t(rbind(cx+sin(t)*r,cy+cos(t)*r))
>>>       if (PLOT == TRUE)
>>> plot(circle,type='l',,ylim=c(-1.15,1.15),xlim=c(-1.15,1.15),
>>>               ann=FALSE, axes=F, col = circleCol)
>>>       return(circle)
>>>
>>>  }
>>> )
>>>
>>> setGeneric("circulin" ,
>>>            function(cx, cy, r = 0.045,
>>>                     objeto, col = 'blue', PLOT = TRUE, label = 0){
>>>                     standardGeneric ("circulin")})
>>>
>>> setMethod("circulin" ,
>>>           signature = "ANY",
>>>           function(cx, cy, r = 0.045, objeto, col = 'blue', PLOT =
>>> TRUE, label = 0){
>>>
>>>       t = seq(0,2*pi,length=100)
>>>       circle = t(rbind(cx+sin(t)*r,cy+cos(t)*r))
>>>       points(circle,type='l', col = col)
>>>       if (label != 0) text(cx,cy,label,cex = .7)
>>>       insiders <-
>>> apply(objeto,1,function(co)(cx-co[1])^2+(cy-co[2])^2<r^2)
>>> assign('insiders', insiders , envir = POI.env)
>>>
>>>  }
>>> )
>>>
>>> setGeneric("addNoise" ,
>>>            function(m, tamanyo = 0.01){standardGeneric  
>>> ("addNoise")})
>>>
>>> setMethod("addNoise" ,
>>>           signature = "matrix",
>>>           function(m, tamanyo = 0.01){
>>>
>>>       noise = function(m, t = tamanyo){
>>>               ruido = rnorm(length(m), 0,t)
>>>               return(m+ruido)
>>>       }
>>>       noised = noise(m)
>>>       unicos = which(duplicated(m) == FALSE)
>>>       m[-unicos,] = noised[-unicos,]
>>>       return(m)
>>>
>>>  }
>>> )
>>>
>>> setGeneric("toHiperbolico" ,
>>>            function(objeto, M = 1 , cx = 0, cy = 0, r = 1){
>>>            standardGeneric ("toHiperbolico")})
>>>
>>> setMethod("toHiperbolico" ,
>>>           signature = "matrix",
>>>           function(objeto, M = 1 , cx = 0, cy = 0, r = 1){
>>>
>>>       insiders =
>>> apply(objeto,1,function(co)(cx-co[1])^2+(cy-co[2])^2<r^2)
>>>       outers = which(insiders < 1)
>>>       objetoP = matrix(toPolar(objeto[,1],objeto[,2]),nc=2)
>>>       if (length(outers)){
>>>                       objetoP[outers,2] = 1
>>>       }
>>>       objetoP[,2] = sapply(objetoP[,2],fishIin,M)
>>>       objetoC = matrix(toCartesian(objetoP[,1],objetoP[,2]),nc=2)
>>> return(list(objetoC = objetoC,
>>>             objetoP = objetoP))
>>>
>>>  }
>>> )
>>>
>>> setGeneric("POIcoords<-" , function(object, value){standardGeneric
>>> ("POIcoords<-")})
>>>
>>> setReplaceMethod( f ="POIcoords",
>>>                  signature = 'POI',
>>>                  definition = function(object, value){
>>>                                  object at Pcoords <- value$Pcoords
>>>                                  object at PcoordsFI <- value$PcoordsFI
>>>                                  object at newPcoords <- value 
>>> $newPcoords
>>>                                  object at objeto <- value$objeto
>>>
>>>                                  return(object)
>>>                               }
>>> )
>>>
>>> setGeneric("POICalc" ,
>>>            function(objeto, NC, cx=0, cy=0, r=1,
>>> ...){standardGeneric ("POICalc")})
>>>
>>> setMethod("POICalc" ,
>>>           signature = "POI",
>>>           function(objeto, NC, cx=0, cy=0, r=1, ...){
>>>
>>>  MatrizSim = objeto at matrizSim
>>>  secuencia = seq(2/NC,2,2/NC)
>>>  Pcoords = matrix(rep(0,NC*2),nc=2)
>>>  n = 1
>>>  for (i in secuencia){
>>>     Pcoords[n,] = c(r * cos(i*pi), r * sin(i*pi))
>>>     n = n+1
>>>  }
>>>  PcoordsFI = matrix(toPolar(Pcoords[,1],Pcoords[,2]),nc=2)
>>>  PcoordsFI[,2] = PcoordsFI[,2]+.15
>>>  PcoordsFI = matrix(toCartesian(PcoordsFI[,1],PcoordsFI[,2]),nc=2)
>>>
>>>  if (nrow(Pcoords) != 1){
>>>  newPcoords = puntosMedios(Pcoords)
>>>  } else {
>>>     newPcoords = Pcoords
>>>  }
>>>
>>>  MatrizSim[is.nan(MatrizSim/rowSums(MatrizSim))] <- 0
>>>
>>>  W = MatrizSim / rowSums(MatrizSim)
>>>  W[is.nan(W)] <- 0
>>>  nwords = nrow(W)
>>>  objeto = matrix(rep(0,2*nwords),nc=2)
>>>  for (j in 1:nwords){
>>>     for (nPOI in 1:NC){
>>>        objeto[j,1] = objeto[j,1]+(W[j,nPOI]*Pcoords[nPOI,1])
>>>        objeto[j,2] = objeto[j,2]+(W[j,nPOI]*Pcoords[nPOI,2])
>>>     }
>>>  }
>>>
>>>  objeto = addNoise(objeto)
>>>
>>>  return(list(Pcoords = Pcoords,
>>>              PcoordsFI = PcoordsFI,
>>>              newPcoords = newPcoords,
>>>              objeto = objeto))
>>>
>>>  }
>>> )
>>>
>>> setGeneric("POIPlot" ,
>>>            function(POI){standardGeneric ("POIPlot")})
>>>
>>> setMethod("POIPlot" ,
>>>           signature = "POI",
>>>           function(POI){
>>>
>>>  par(bg=POI at plotCol, mar = c(0.1,0.1,0.1,0.1), family =  
>>> POI at itemsFamily)
>>>
>>>
>>>  if (exists('POI.env')) {
>>>     if (exists('POI', envir = POI.env)) {
>>>       POI <- get('POI', envir = POI.env)
>>>     }
>>>  }
>>>
>>>  selected = POI at selected
>>>  objeto = POI at objeto
>>>  newcoords = POI at newcoords
>>>  newcoords_1 = POI at newcoords_1
>>>  NC = length(POI at wordsInQuery)
>>>  cx=0
>>>  cy=0
>>>  r=1
>>>  etiq2 = POI at docs[,1]
>>>  etiq = POI at wordsInQuery
>>>  fishEYE = TRUE
>>>  M = POI at M
>>>  poisTextCol = POI at poisTextCol
>>>  colores = POI at colores[POI at docs]
>>>  poisCircleCol = POI at poisCircleCol
>>>  linesCol = POI at linesCol
>>>  itemsCol = POI at itemsCol
>>>  circleCol = POI at circleCol
>>>  LABELS =  POI at LABELS
>>>  Pcoords = POI at Pcoords
>>>  newPcoords = POI at newPcoords
>>>  cgnsphrFont = POI at cgnsphrFont
>>>
>>>  newcoords_par = newcoords
>>>
>>>  newcoords_Pcoords = matrix(rep( c(newcoords,newcoords_1 ),
>>>                             nrow(Pcoords)),nc=2,byrow=TRUE)
>>>
>>>  newcoords_puntosMediosPcoords =  
>>> matrix(rep( c(newcoords,newcoords_1),
>>>
>>> nrow(newPcoords)),nc=2,byrow=TRUE)
>>>
>>>  newcoords = matrix(rep( c(newcoords,newcoords_1),
>>>                     nrow(objeto)),nc=2,byrow=TRUE)
>>>
>>>  objeto = objeto+newcoords
>>>  objetoH = toHiperbolico(objeto, M)
>>>  objetoC = objetoH$objetoC
>>>  objetoP = objetoH$objetoP
>>>
>>>  Pcoords = Pcoords + newcoords_Pcoords
>>>  PcoordsH = toHiperbolico(Pcoords, M)
>>>  PcoordsC = PcoordsH$objetoC
>>>  PcoordsP = PcoordsH$objetoP
>>>
>>>  newPcoords = newPcoords + newcoords_puntosMediosPcoords
>>>  newPcoordsH = toHiperbolico(newPcoords, M)
>>>  Pcoords_objetoC = newPcoordsH$objetoC
>>>
>>>  if (LABELS) {
>>>     PcoordsFI = matrix(toPolar(PcoordsC[,1],PcoordsC[,2]),nc=2)
>>>     PcoordsFI[,2] = 1 +.15
>>>     PcoordsFI = matrix(toCartesian(PcoordsFI[,1],PcoordsFI[, 
>>> 2]),nc=2)
>>>  }
>>>
>>>  plot(circulo(0,0,1, circleCol, PLOT =
>>> FALSE),cex=.5,ylim=c(-1.15,1.15),xlim=c(-1.15,1.15),
>>>                 ann=FALSE, axes=F,type='l', col = circleCol)
>>>
>>>  points(objetoC, pch=19, col = colores, cex = 1.5 - objetoP[,2])
>>>
>>>  text(objetoC[,1], objetoC[,2], labels = etiq2, cex = cgnsphrFont -
>>> objetoP[,2],
>>>       pos = 3, col = itemsCol)
>>>
>>>  abline(h = cx, col = 'grey', lty = 'dashed')
>>>  abline(v = cy, col = 'grey', lty = 'dashed')
>>>
>>>
>>>  points(PcoordsC,cex = 2, col = poisCircleCol)
>>>
>>>  lines(Pcoords_objetoC, col = linesCol)
>>>
>>>
>>>
> segments(Pcoords_objetoC[nrow(Pcoords_objetoC), 
> 1],Pcoords_objetoC[nrow(Pcoords_objetoC),2],
>>>           Pcoords_objetoC[1,1],Pcoords_objetoC[1,2], col = linesCol)
>>>
>>>  if (LABELS) {
>>>     text(PcoordsFI[,1],PcoordsFI[,2],toupper(etiq),cex=.75, col =
>>> poisTextCol)
>>>  }
>>>
>>>  if (selected != 1) {
>>>     circulin(0,0, .5, objeto = objetoC)   # probando
>>>  }
>>>
>>>  if (!exists('POI.env')){
>>>     POI.env <<- new.env()
>>>  }
>>>  poiCOPY = POI
>>>  poiCOPY at objeto <- objeto
>>>  poiCOPY at objetoC <- objetoC
>>>  poiCOPY at newPcoords <- newPcoords
>>>  poiCOPY at Pcoords <- Pcoords
>>>  assign('POI',poiCOPY , envir = POI.env)
>>>
>>>  }
>>> )
>>>
>>>
>>> # *strong*VERY*strong* basic kmeans example with 6 clusters and 10
>>> variables
>>> x <- matrix(rnorm(100, mean = 1, sd = .3), ncol = 10)
>>> x <- rbind(x,matrix(rnorm(200, mean = 5, sd = .3), ncol = 10))
>>> x <- rbind(x,matrix(rnorm(100, mean = 10, sd = .3), ncol = 10))
>>> x <- rbind(x,matrix(rnorm(100, mean = 15, sd = .3), ncol = 10))
>>> x <- rbind(x,matrix(rnorm(200, mean = 20, sd = .3), ncol = 10))
>>> x <- rbind(x,matrix(rnorm(100, mean = 25, sd = .3), ncol = 10))
>>>
>>> cl <- kmeans(x, 6, iter.max = 100 ,nstart = 25)
>>>
>>> # *strong*VERY*strong* basic way of reordering cluster output for
>>> better plotting
>>> # here we reorder using just the first cluster
>>> reorder.cl <- as.numeric(names(sort(rank((as.matrix(dist(cl$centers,
>>> diag = T)))[,1]))))
>>> cl$centers <- cl$centers[reorder.cl, ]
>>> cl$size    <- cl$size[reorder.cl]
>>>
>>> # distance matrix between each element and its cluster center
>>> matrizSim = matrix(rep(0, nrow(cl$centers) * nrow(x)), ncol =
>>> nrow(cl$centers))
>>> for (n in 1:nrow(cl$centers)){
>>> for (i in 1:nrow(x)) {
>>>   a = x[i,]
>>>   b = cl$centers[n,]
>>>   matrizSim[[i,n]] = dist(rbind(a,b)) # eucl dist
>>> }
>>> }
>>>
>>> # From dist to similarity (0 - 1)
>>> matrizSim = 1 - (matrizSim / rowSums(matrizSim) )
>>> # exagerate similarity
>>> matrizSim  = matrizSim^3
>>>
>>> # Create POI plot
>>> clusterPOI = new('POI')
>>> clusterPOI at M = 1          # no fisheye distorsion
>>> clusterPOI at matrizSim <- matrizSim
>>> clusterPOI at wordsInQuery <- paste('"',
>>> as.character(round(cl$centers[,1]),2),'"', '
>>> size',as.character(cl$size))
>>> POIcoords(clusterPOI) <- POICalc(clusterPOI
>>> ,length(clusterPOI at wordsInQuery))
>>> clusterPOI at docs <-
>>>
>>> cbind(matrix(seq(1:nrow(clusterPOI at objeto
> ))),matrix(seq(1:nrow(clusterPOI at objeto))))
>>> clusterPOI at colores <- cl$cluster  + 1
>>> clusterPOI at cos.query.docs <- rep(1, length(cl$cluster))
>>> POI.env <<- new.env()
>>> POIPlot(clusterPOI)
>>
>>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



More information about the R-help mailing list