[R-sig-Geo] Spatial clustering gridded data with missing values (over water)
ACD
andrewcd at gmail.com
Tue Nov 10 16:13:00 CET 2015
I'm trying to cluster a spatial dataset, and use the cluster labels as
an input to a second process. I've been using the `spdep` package in R.
I've got gridded data at 0.5-degree lat/lon resolution. There are 19
covariates in the example subset linked here:
https://www.dropbox.com/s/i72na4k0k5gqvvx/example_data?dl=0
The following shows that I can't calculate the minimum spanning tree --
a necessary input into `skater` -- when the dataset includes areas
undefined because they are over water.
How would one get around this?
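# Download and load the example data (the saved file keeps the '?dl=0' suffix)
system('wget https://www.dropbox.com/s/i72na4k0k5gqvvx/example_data?dl=0')
load('example_data?dl=0')
with(x, plot(lon, lat))
library(spdep)
# Queen-contiguity neighbours for a full rectangular grid of this extent
bh.nb <- cell2nb(length(unique(x$lon)), length(unique(x$lat)),
                 torus = FALSE, type = 'queen')
# Edge costs between neighbouring cells; this is the call that fails
lcosts <- nbcosts(nb = bh.nb, data = x, method = 'euclidean')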
Error in data[id.neigh, , drop = FALSE] : subscript out of bounds
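A possible reading of this error, offered as an assumption rather than a confirmed diagnosis: `cell2nb` builds neighbours for a complete rectangular grid, so its indices can run past the number of rows in `x` once the over-water cells are missing. The toy example below uses made-up coordinates and base R only; `full` and `toy` are illustrative names, not part of the dataset.

```r
# Toy 0.5-degree grid; three cells are dropped to stand in for water.
full <- expand.grid(lon = seq(-120, -118, by = 0.5),
                    lat = seq(33, 34, by = 0.5))
toy <- full[-c(1, 2, 6), ]

# Number of cells a full rectangular grid of this extent would have ...
n_grid <- length(unique(toy$lon)) * length(unique(toy$lat))
# ... versus the number of rows actually available to index into.
n_rows <- nrow(toy)

print(c(grid_cells = n_grid, data_rows = n_rows))
```

If the two counts differ for the real `x`, a neighbour list built from the grid dimensions alone cannot line up row-for-row with the data.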
If I restrict the data to cut out the missing values, I have no problem:
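# Keep only the part of the grid with no missing cells
x <- x[x$lon > -117, ]
bh.nb <- cell2nb(length(unique(x$lon)), length(unique(x$lat)),
                 torus = FALSE, type = 'queen')
# Edge costs, binary-style weights, then the minimum spanning tree
lcosts <- nbcosts(nb = bh.nb, data = x, method = 'euclidean')
nb.w <- nb2listw(bh.nb, lcosts, style = "B")
mst.bh <- mstree(nb.w, 10)
# Prune the tree with 5 cuts and plot the resulting clusters
res1 <- skater(mst.bh[, 1:2], x, 5)
plot(res1, cbind(x$lon, x$lat), cex.circles = 0.035, cex.lab = .7)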
How do I get around this over-water problem? I want to be able to
cluster the land surfaces, including islands and peninsulas. I suppose
that I want islands to be linked to their nearest point of land.
References appreciated as well as fixes to the specific problem with the
`spdep` interface.
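The idea of linking islands to their nearest point of land can be sketched in base R, independent of `spdep`. This is only an illustration under stated assumptions: the coordinates in `land` are invented, neighbours are defined as cells within one grid step in both directions (queen contiguity), and "nearest" uses plain Euclidean distance in degrees, which ignores the Earth's curvature. All names (`land`, `adj`, `comp`, `nb_list`) are hypothetical.

```r
# Toy land cells on a 0.5-degree grid: a 3x2 "mainland" plus one "island" cell.
land <- data.frame(lon = c(0, 0.5, 1, 0,   0.5, 1,   3),
                   lat = c(0, 0,   0, 0.5, 0.5, 0.5, 0.5))
res <- 0.5
n <- nrow(land)

# Queen contiguity from coordinates: within one grid step in lon and in lat.
dlon <- abs(outer(land$lon, land$lon, "-"))
dlat <- abs(outer(land$lat, land$lat, "-"))
adj <- dlon <= res + 1e-8 & dlat <= res + 1e-8
diag(adj) <- FALSE

# Label connected components with a simple flood fill.
comp <- integer(n)
k <- 0L
for (s in seq_len(n)) {
  if (comp[s] != 0L) next
  k <- k + 1L
  comp[s] <- k
  queue <- s
  while (length(queue) > 0) {
    i <- queue[1]
    queue <- queue[-1]
    new <- which(adj[i, ] & comp == 0L)
    comp[new] <- k
    queue <- c(queue, new)
  }
}

# Repeatedly join the smallest component to its nearest outside cell
# (Euclidean distance in degrees: an assumption, not a geodesic distance).
d <- sqrt(outer(land$lon, land$lon, "-")^2 + outer(land$lat, land$lat, "-")^2)
while (length(unique(comp)) > 1) {
  sizes <- table(comp)
  small <- as.integer(names(sizes)[which.min(sizes)])
  inside <- which(comp == small)
  outside <- which(comp != small)
  sub <- d[inside, outside, drop = FALSE]
  hit <- which(sub == min(sub), arr.ind = TRUE)[1, ]
  i <- inside[hit[1]]
  j <- outside[hit[2]]
  adj[i, j] <- adj[j, i] <- TRUE   # add a symmetric island-to-land link
  comp[inside] <- comp[j]          # merge the two components
}

# Neighbour list: for each cell, the row indices of its neighbours.
nb_list <- lapply(seq_len(n), function(i) which(adj[i, ]))
print(nb_list[[7]])
```

Because the neighbours are built from the rows that actually exist, the indices always stay within `nrow(land)`. Turning such a list into something `nbcosts` and `mstree` accept would still require wrapping it as an `nb` object, which is not shown here.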
Thanks!
Andrew