[R-sig-Geo] Is a simple feature -friendly version of spdep being developed?
Tiernan Martin
tiernanmartin at gmail.com
Fri May 12 20:22:22 CEST 2017
Is anyone thinking about creating an adaptation of the `spdep` package that
expects sf-class inputs and works well in a pipeline?
I understand that there is skepticism about the wisdom of adopting the
“tidyverse” principles throughout the R package ecosystem, and I share the
concern that an over-reliance on any single paradigm could reduce the
resilience and diversity of the system as a whole.
That said, I believe that the enthusiastic adoption of the `sf` package and
its connections with widely-used tidyverse packages like `dplyr` and
`ggplot2` may result in increased demand for sf-friendly spatial analysis
tools. As an amateur who recently started using R as my primary GIS tool, I
find that the tidyverse's preference for data frames, S3 objects, list
columns, and pipeline workflows seems well-suited to the field of spatial
analysis. Are there fundamental reasons why the `spdep` tools cannot (or
should not) be adapted to the tidyverse "dialect"?
Let me put the question in the context of an actual analysis: in February
2017, the pop culture infovis website The Pudding (https://pudding.cool/)
published an analysis of regional preferences for Oscar-nominated films in
the US (https://pudding.cool/2017/02/oscars_so_mapped/). A few days ago,
the author posted a tutorial explaining the method of “regional smoothing”
used to create the article’s choropleths (
https://pudding.cool/process/regional_smoothing/).
The method relies on several `spdep` functions (
https://github.com/polygraph-cool/smoothing_tutorial/blob/master/smoothing_tutorial.R).
In the code below, I provide a reprex with a smaller dataset included in
the `sf` package:
library(sf)
library(spdep)

nc <- st_read(system.file("shape/nc.shp", package = "sf")) # North Carolina counties
nc_shp <- as(nc, 'Spatial')

coords <- coordinates(nc_shp)
IDs <- row.names(as(nc_shp, "data.frame"))

knn5 <- knn2nb(knearneigh(coords, k = 5), row.names = IDs) # find the nearest neighbors for each county
knn5 <- include.self(knn5)

localGvalues <- localG(x = as.numeric(nc_shp@data$NWBIR74),
                       listw = nb2listw(knn5, style = "B"),
                       zero.policy = TRUE) # calculate the G scores
localGvalues <- round(localGvalues, 3)

nc_shp@data$LOCAL_G <- as.numeric(localGvalues)

p1 <- spplot(nc_shp, c('NWBIR74'))
p2 <- spplot(nc_shp, c('LOCAL_G'))
plot(p1, split = c(1, 1, 2, 2), more = TRUE)
plot(p2, split = c(1, 2, 2, 2), more = TRUE)
Here’s what I imagine that would look like in a tidyverse pipeline (please
note that this code is for illustrative purposes and will not run):
library(tidyverse)
library(purrr)
library(sf)
library(sfdep) # this package doesn't exist (yet)

nc <- st_read(system.file("shape/nc.shp", package = "sf"))

nc_g <-
  nc %>%
  mutate(KNN = map(.x = geometry, ~ sfdep::st_knn(.x, k = 5, include.self = TRUE)), # find the nearest neighbors for each county
         NB_LIST = map(.x = KNN, ~ sfdep::st_nb_list(.x, style = 'B')), # make a list of the neighbors using the binary method
         LOCAL_G = sfdep::st_localG(x = NWBIR74, listw = NB_LIST, zero.policy = TRUE), # calculate the G scores
         LOCAL_G = round(LOCAL_G, 3))
We can see that the (hypothetical) tidyverse version reduces the number of
intermediate objects and wraps the creation of the G scores into a single
code chunk with clear steps.
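For what it's worth, something close to that pipeline is already possible today without a dedicated `sfdep` package, since `spdep` only needs a coordinate matrix and a neighbour list. This is only a sketch (the `st_knn`/`st_localG` names above remain hypothetical); every function below exists in the current `sf`, `spdep`, and `dplyr` releases:

```r
library(sf)
library(spdep)
library(dplyr)

nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Build the neighbour structure from the sf geometry directly,
# with no round trip through a Spatial* object:
coords <- st_coordinates(st_centroid(st_geometry(nc)))
knn5   <- include.self(knn2nb(knearneigh(coords, k = 5)))

# The G scores then land in the sf data frame via a single mutate():
nc <- nc %>%
  mutate(LOCAL_G = round(as.numeric(
    localG(NWBIR74, listw = nb2listw(knn5, style = "B"))), 3))

plot(nc["LOCAL_G"]) # sf's plot method draws a choropleth of the scores
```

The neighbour list still has to be built outside `mutate()`, because it depends on all rows at once rather than row by row, which is exactly the kind of seam a purpose-built sfdep-style package could hide.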
I'd be grateful to hear from the users and developers of the `spdep` and
`sf` packages about this topic!
Tiernan Martin