[R-sig-Geo] Natural Breaks Classification

Hisaji ONO hi_ono2001 at ybb.ne.jp
Fri Mar 3 18:04:05 CET 2006


Hi.

 As you might know, most of Jenks' optimization method
depends on Fischer's "EXACT OPTIMIZATION" METHOD in
(Fisher, W. D., 1958, On grouping for maximum homogeneity.
Journal of the American Statistical Association, 53, 789
・98.).

 This source code is available from following CMU's
statlib site in fortran code.

 http://lib.stat.cmu.edu/cmlib/src/cluster/fish.f

 Jenks' one is available in following paper media.
Probably its in Basic.

 Jenks, G. F. (1977). Optimal data classification for
choropleth maps, Occasional paper No. 2. Lawrence, Kansas:
University of Kansas, Department of Geography. 

 I've ported above code into
Geotools-lite(http://sourceforge.net/project/showfiles.php?group_id=4091)
in Java by modified by J. Macgill.

 Most of Jenks'
code(uk.ac.leeds.ccg.geotools.classification.NaturalBreaks.java)
as follows.

 /**
     * @return int[]
     * @param list com.sun.java.util.collections.ArrayList
     * @param numclass int
     */
    public int[] getJenksBreaks(ArrayList list, int
numclass) {
        
        
        //int numclass;
        int numdata = list.size();
        
        
        double[][] mat1 = new double[numdata + 1][numclass
+ 1];
        double[][] mat2 = new double[numdata + 1][numclass
+ 1];
        double[] st = new double[numdata];
        
        
        for (int i = 1; i <= numclass; i++) {
            mat1[1][i] = 1;
            mat2[1][i] = 0;
            for (int j = 2; j <= numdata; j++)
                mat2[j][i] = Double.MAX_VALUE;
        }
        double v = 0;
        for (int l = 2; l <= numdata; l++) {
            double s1 = 0;
            double s2 = 0;
            double w = 0;
            for (int m = 1; m <= l; m++) {
                int i3 = l - m + 1;

                double val = ((Double)list.get(i3
-1)).doubleValue();
                
                
                s2 += val * val;
                s1 += val;
                
                
                w++;
                v = s2 - (s1 * s1) / w;
                int i4 = i3 - 1;
                if (i4 != 0) {
                    for (int j = 2; j <= numclass; j++) {
                        if (mat2[l][j] >= (v + mat2[i4][j
- 1])) {
                            mat1[l][j] = i3;
                            mat2[l][j] = v + mat2[i4][j -
1];

                        };
                    };
                };
            };
            mat1[l][1] = 1;
            mat2[l][1] = v;
        };
        int k = numdata;
        
        
        int[] kclass = new int[numclass];
        
        
        kclass[numclass - 1] = list.size() - 1;
        
        
        for (int j = numclass; j >= 2; j--) {
            System.out.println("rank = " + mat1[k][j]);
            int id =  (int) (mat1[k][j]) - 2;
            System.out.println("val = " + list.get(id));
            //System.out.println(mat2[k][j]);
            
            
            kclass[j - 2] = id;
            
            
            k = (int) mat1[k][j] - 1;
            
            
        };
        return kclass;
    }
    
    class doubleComp implements Comparator {
    public int compare(Object a, Object b) {
     if (((Double) a).doubleValue() < ((Double)
b).doubleValue())
      return -1;
     if (((Double) a).doubleValue() > ((Double)
b).doubleValue())
      return 1;
     return 0;
    }
    }

 Regards.




More information about the R-sig-Geo mailing list