[R-sig-Geo] classInt::classIntervals only for very small data sets?

Jochen Albrecht jochen at hunter.cuny.edu
Fri Feb 26 21:14:08 CET 2010


I am trying to do a natural breaks (Jenks) classification on a data set 
with some 300,000 observations. I started with
     library(classInt)
     salescat <- classIntervals(sales, n = 100, style = "jenks")
and stopped the process after it had run for 12 hours. I then tried just 
ten classes, but that made no difference.
With a subset of just 300 observations, it runs for 36 seconds.
For 1,000 records, it runs seven minutes and then throws the following 
error:
     Error in if (mat2[l, j] >= (v + mat2[i4, j - 1])) { :
       missing value where TRUE/FALSE needed
     In addition: Warning message:
     In val * val : NAs produced by integer overflow
I checked the data interactively; there are no missing or non-integer 
values.
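
The warning "In val * val : NAs produced by integer overflow" is consistent with the values being stored as R integers: base R integers are 32-bit, so squaring any value above 46,340 gives NA. The following is a minimal sketch of that behaviour in base R only. The vector sales_demo is made up for illustration, and converting with as.numeric() before classifying is an assumption about a possible workaround, not a confirmed fix for classIntervals.

```r
# Hypothetical integer-typed values standing in for the real sales data.
sales_demo <- c(120L, 46340L, 46341L, 250000L)

# Square in integer arithmetic, capturing the warning instead of printing it.
overflow_warning <- NULL
int_squares <- withCallingHandlers(
  sales_demo * sales_demo,
  warning = function(w) {
    overflow_warning <<- conditionMessage(w)
    invokeRestart("muffleWarning")
  }
)

# Square after conversion to double: no overflow for values of this size.
dbl_squares <- as.numeric(sales_demo) * as.numeric(sales_demo)

print(.Machine$integer.max)  # largest value an R integer can hold
print(int_squares)           # the last two entries are NA
print(overflow_warning)      # the captured warning message
print(dbl_squares)           # all four squares are finite
```

If the real data are integer-typed, an NA produced this way would also explain the "missing value where TRUE/FALSE needed" error, since a comparison involving NA cannot be used as an if() condition.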
ArcMap classifies the subsets instantaneously and without hiccups, and 
takes about ten seconds for all 300,000 records (though it limits itself 
to a maximum of 32 classes).
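
The two R timings reported above (36 seconds for 300 observations, about seven minutes for 1,000) allow a rough extrapolation to the full data set. The sketch below assumes run time grows as a power of the number of observations, which is an assumption made here for illustration and not a documented property of classIntervals. The variable names are hypothetical, and the seven-minute run is counted as 420 seconds even though it ended in an error.

```r
# Timings taken from the message: observations and elapsed seconds.
n_obs   <- c(300, 1000)
seconds <- c(36, 7 * 60)

# Assumed model: seconds = a * n^b. Two points determine the exponent b.
exponent <- log(seconds[2] / seconds[1]) / log(n_obs[2] / n_obs[1])

# Extrapolate from the 1,000-record timing to the full 300,000 records.
n_full      <- 300000
est_seconds <- seconds[2] * (n_full / n_obs[2])^exponent
est_days    <- est_seconds / 86400

cat("estimated exponent:", round(exponent, 2), "\n")
cat("estimated run time in days:", round(est_days), "\n")
```

Under that assumption the exponent comes out close to 2 and the estimate for all 300,000 records is more than a year, which would be consistent with the 12-hour run not finishing.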
Do you have any suggestions as to what I am doing wrong or what could be 
done to resolve my problem?
Cheers,
     Jochen


