[R-sig-Geo] Kernel Density Estimator Help

Alex Fitz afitz at email.wm.edu
Mon Jun 13 18:05:59 CEST 2016


Hi,

I'm running into an issue with my code while trying to use a kernel density estimator to cluster points. I am working with my data in two different formats to find out which works best. First, I have my data in the form of a matrix. An example is below; in my actual code, latitude and longitude are attached to the columns and rows of the matrix.

# Each c(...) below is one row of the matrix, so fill by row.
# (Setting dim() on the flat vector would fill column by column instead.)
m <- matrix(c(
  c(8.83,8.89,8.81,8.87,8.9,8.87),
  c(8.89,8.94,8.85,8.94,8.96,8.92),
  c(8.84,8.9,8.82,8.92,8.93,8.91),
  c(8.79,8.85,8.79,8.9,8.94,8.92),
  c(8.79,8.88,8.81,8.9,8.95,8.92),
  c(8.8,8.82,8.78,8.91,8.94,8.92),
  c(8.75,8.78,8.77,8.91,8.95,8.92),
  c(8.8,8.8,8.77,8.91,8.95,8.94),
  c(8.74,8.81,8.76,8.93,8.98,8.99),
  c(8.89,8.99,8.92,9.1,9.13,9.11),
  c(8.97,8.97,8.91,9.09,9.11,9.11),
  c(9.04,9.08,9.05,9.25,9.28,9.27),
  c(9,9.01,9,9.2,9.23,9.2),
  c(8.99,8.99,8.98,9.18,9.2,9.19),
  c(8.93,8.97,8.97,9.18,9.2,9.18)
), nrow = 15, ncol = 6, byrow = TRUE)

I also have my data in a data table where column 1 is my latitude, column 2 is my longitude, and column 3 is the value.

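# Each c(...) below is one (latitude, longitude, value) row, so fill by row.
z <- matrix(c(
  c(8.83,8.89, 2),
  c(8.89,8.94, 4),
  c(8.84,8.9, 1),
  c(8.79,8.852, 4),
  c(8.79,8.88, 5),
  c(8.8,8.82, 2),
  c(8.75,8.78, 1),
  c(8.8,8.8, 2),
  c(8.74,8.81, 7),
  c(8.89,8.99, 1),
  c(8.97,8.97, 6),
  c(9.04,9.08, 8),
  c(9,9.01, 1),
  c(8.99,8.99, 8),
  c(8.93,8.97, 2)
), nrow = 15, ncol = 3, byrow = TRUE)
colnames(z) <- c("lat", "long", "value")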
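
The two formats carry the same information, so one can be derived from the other. A minimal sketch of going from the matrix form to the (lat, long, value) form in base R follows. The coordinates and the small matrix are hypothetical placeholders, and it assumes rows correspond to latitude and columns to longitude, which may not match the real grid.

```r
# Hypothetical coordinates: placeholders, not the real grid.
lat  <- c(9.0, 9.5, 10.0)   # assumed: one latitude per matrix row
long <- c(7.0, 7.5)         # assumed: one longitude per matrix column

# Small stand-in for the raster matrix, filled row by row.
m_small <- matrix(c(8.83, 8.89,
                    8.89, 8.94,
                    8.84, 8.90),
                  nrow = 3, ncol = 2, byrow = TRUE,
                  dimnames = list(as.character(lat), as.character(long)))

# expand.grid() varies its first argument fastest, which matches R's
# column-major storage, so as.vector(m_small) lines up with the grid rows.
long_table <- expand.grid(lat = lat, long = long)
long_table$value <- as.vector(m_small)

print(long_table)
```

Each row of long_table is then one grid cell, in the same layout as z above.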

The actual data I am using is from larger rasters and shapefiles. 
The raster is from http://beta.sedac.ciesin.columbia.edu/data/set/gpw-v4-population-count/data-download.
The shapefiles are from http://www.gadm.org/download; I am using Nigeria.

The main question of this post is about clustering and the best data format for clustering functions. I currently have all of the grid points for the entire country as (lat, long, value) triples. I want to run a kernel density estimator across all of the points and then cluster based on certain values. The pdfCluster package seems to do just that, except I'm not sure how to get it to accept lat/long values and run across a geographic plane. Since my data cover a geographic area and aren't completely continuous, I'm running into errors. Any hints on how to modify the pdfCluster package to accept such values, or on which data format is best to use, would be greatly appreciated.
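
To make the kind of estimate described above concrete, here is a minimal base-R sketch of a value-weighted Gaussian kernel density evaluated on a regular grid. It does not use pdfCluster. The points, the bandwidth h = 0.05, and the function name weighted_kde are all illustrative assumptions, and treating degrees of latitude and longitude as planar coordinates is only an approximation.

```r
# Hypothetical points: (lat, long, value) rows standing in for grid cells.
pts <- data.frame(
  lat   = c(8.83, 8.89, 8.84, 9.04, 9.00),
  long  = c(8.89, 8.94, 8.90, 9.08, 9.01),
  value = c(2, 4, 1, 8, 1)
)

# Value-weighted Gaussian kernel density at one location (x = long, y = lat).
# h is the bandwidth in degrees; 0.05 is an arbitrary illustrative choice.
weighted_kde <- function(x, y, pts, h = 0.05) {
  w  <- pts$value / sum(pts$value)            # weights sum to 1
  d2 <- (x - pts$long)^2 + (y - pts$lat)^2    # squared planar distance
  sum(w * exp(-d2 / (2 * h^2))) / (2 * pi * h^2)
}

# Evaluate the density on a regular grid covering the points.
gx <- seq(min(pts$long), max(pts$long), length.out = 20)
gy <- seq(min(pts$lat),  max(pts$lat),  length.out = 20)
dens <- outer(gx, gy, Vectorize(function(x, y) weighted_kde(x, y, pts)))

# dens[i, j] is the estimated density at longitude gx[i], latitude gy[j].
image(gx, gy, dens, xlab = "long", ylab = "lat")
```

Clustering would then be a separate step applied to a surface like dens, for example by grouping cells around its high-density regions.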

Thanks,
Alex

