[R-sig-Geo] Is a simple feature-friendly version of spdep being developed?

Edzer Pebesma edzer.pebesma at uni-muenster.de
Wed May 17 12:41:06 CEST 2017


Interesting discussion, thanks for bringing this up here. Some ideas:

I agree that there's more than one answer to the question of what is tidy
in the context of analysing spatial data. Compared to the S4 classes in
sp, the simplicity of S3 and data.frames also carries a certain
risk/loss; two examples came up in the last few days, in [1] and [2].

I don't think that the risks of adding neighbourhood lists or sparse
weights matrices to sf/data.frame objects are really large - even now,
I'd guess (untried!) that it is possible to create nb lists, then subset
the data and this list or sparse matrix, and end up with a meaningless
mess going relatively unnoticed (with some luck).

It might be a good idea to give these lists a dedicated class (right now
`st_intersects` or `st_rook` return an unclassed list), and modify the
`filter.sf` method such that it warns when these objects are
subsetted - in which case they would lose their meaning.
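
For illustration, a minimal sketch of that idea (untested; the `sf_nb`
class and its `[` method are hypothetical names, and a real
implementation would also hook into `filter.sf` itself):

library(sf)
nc <- st_read(system.file("shape/nc.shp", package = "sf"))
nb <- st_intersects(nc, nc)   # currently an unclassed list
class(nb) <- "sf_nb"          # hypothetical class marking it as a graph
# warn on subsetting, because the integer indices stored in each element
# refer to row positions of the full original object
`[.sf_nb` <- function(x, i, ...) {
  warning("subsetting an sf_nb object invalidates its neighbour indices")
  structure(unclass(x)[i], class = "sf_nb")
}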

I'm not exactly sure how to pack a sparse weights matrix in a
list-column, but that can't be hard; somehow I feel that sparse matrices
from Matrix would not fit in, since they're not lists - convert on the fly?
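
On-the-fly conversion might look something like this rough, untried
sketch (row-wise extraction from a row-oriented Matrix class):

library(Matrix)
W <- sparseMatrix(i = c(1, 1, 2, 3), j = c(2, 3, 1, 1),
                  x = c(0.5, 0.5, 1, 1))
Wr <- as(W, "RsparseMatrix")   # row-oriented, so rows are easy to slice
wts <- lapply(seq_len(nrow(Wr)), function(i) {
  jj <- seq.int(Wr@p[i] + 1, length.out = Wr@p[i + 1] - Wr@p[i])
  list(index = Wr@j[jj] + 1L, weight = Wr@x[jj])   # 0-based -> 1-based
})
# wts is a plain list of (index, weight) pairs, usable as a list-column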

[1] https://github.com/edzer/sfr/issues/343
[2] https://github.com/edzer/sfr/issues/340

On 16/05/17 14:27, Michael Sumner wrote:
> Hello, I'm following this conversation with great interest. There's a lot
> here for me to work with; it's an active pursuit, and I don't have much to
> offer yet, but I have a couple of inline responses below. I've learnt a
> lot about spdep from this discussion - something I've meant to explore for
> a long time, but my work has never had the same modelling focus.
> 
> On Tue, 16 May 2017 at 18:25 Roger Bivand <Roger.Bivand at nhh.no> wrote:
> 
>> On Tue, 16 May 2017, Tiernan Martin wrote:
>>
>>>> Should the weights be
>>>> squirreled away inside the object flowing down the pipe?
>>>
>>> I'd propose that the functions be written in such a way that the
>>> neighbors, weights, and test results can be stored in list cols within
>>> the data.frame.
>>>
>>> This way multiple tests (and their inputs) can be stored in a single,
>>> easily comprehensible, rectangular data object.
>>>
>>> Here's some more pretend code to illustrate that concept with regard
>>> to sf:
>>>
>>>  nc %>%
>>>    group_by(NAME) %>%
>>>    nest %>%
>>>    mutate(
>>>      geometry = map(.x = data, ~ .x[['geometry']]),
>>>      neighbors = map(.x = data, .f = my_nb),      # list col of the neighbors
>>>      weights = map(.x = data, .f = add_weights),  # weights object (class = wt; is this a sparse matrix in spdep?)
>>>      test_moran = map2(.x = data, .y = weights, .f = my_moran),  # Moran's I and sample kurtosis of x
>>>      test_geary = map2(.x = data, .y = weights, .f = my_geary)   # Geary's C and sample kurtosis of x
>>>    )
>>>
>>> #> # A tibble: 100 x 7
>>> #>           NAME              data         geometry  neighbors  weights test_moran test_geary
>>> #>         <fctr>            <list>           <list>     <list>   <list>     <list>     <list>
>>> #>  1        Ashe <tibble [1 x 14]> <simple_feature> <list [2]> <S3: wt> <list [2]> <list [2]>
>>> #>  2   Alleghany <tibble [1 x 14]> <simple_feature> <list [5]> <S3: wt> <list [2]> <list [2]>
>>> #>  3       Surry <tibble [1 x 14]> <simple_feature> <list [4]> <S3: wt> <list [2]> <list [2]>
>>> #>  4   Currituck <tibble [1 x 14]> <simple_feature> <list [2]> <S3: wt> <list [2]> <list [2]>
>>> #>  5 Northampton <tibble [1 x 14]> <simple_feature> <list [3]> <S3: wt> <list [2]> <list [2]>
>>> #>  6    Hertford <tibble [1 x 14]> <simple_feature> <list [6]> <S3: wt> <list [2]> <list [2]>
>>> #>  7      Camden <tibble [1 x 14]> <simple_feature> <list [5]> <S3: wt> <list [2]> <list [2]>
>>> #>  8       Gates <tibble [1 x 14]> <simple_feature> <list [2]> <S3: wt> <list [2]> <list [2]>
>>> #>  9      Warren <tibble [1 x 14]> <simple_feature> <list [5]> <S3: wt> <list [2]> <list [2]>
>>> #> 10      Stokes <tibble [1 x 14]> <simple_feature> <list [2]> <S3: wt> <list [2]> <list [2]>
>>> #> # ... with 90 more rows
>>>
>>> Examples of this nested data.frame / list col approach can be found here:
>>> http://r4ds.had.co.nz/many-models.html#nested-data
>>
>> Thanks, interesting but really scary. You get numeric output for local
>> Moran's I and local Geary, but what will happen next?
>>
>> All of these tests assume a correctly specified mean model. Without this
>> assumption being met (absence of trends, absence of global spatial
>> autocorrelation in the local test case, absence of missing right-hand-side
>> variables and absence of incorrect functional forms for these, ...), a map
>> pretending to invite "inference" of "hotspots" etc. - collections of
>> observations with similar values of the local statistic - will most often
>> be highly misleading. This isn't only about whether it is possible to do
>> technically, but also about not making it simpler than reason would
>> suggest. With a correct mean model, there may not be any (local) spatial
>> autocorrelation left. See ?localmoran.sad and ?localmoran.exact, and their
>> references, and McMillen (2003) (the McSpatial author). I won't even
>> mention the need to adjust p.values for multiple comparisons ...
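>>
>> (A small sketch of such an adjustment, assuming the nc data used
>> elsewhere in this thread; p.adjustSP() in spdep adjusts using each
>> region's neighbour count rather than the full n:)
>>
>> library(sf); library(spdep)
>> nc <- st_read(system.file("shape/nc.shp", package = "sf"))
>> nb <- poly2nb(as(nc, "Spatial"))
>> lm1 <- localmoran(nc$NWBIR74, nb2listw(nb))
>> # adjust each region's local p-value for its neighbour-based comparisons
>> pa <- p.adjustSP(lm1[, "Pr(z > 0)"], nb, method = "fdr")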
>>
>>>
>>>> Extending this to adding weights to sf objects may be going too far. It
>>>> seems slightly forced to - say - attach a sparse matrix to an sf geometry
>>>> column as an attribute, or to add a neighbour nested list column when
>>>> writing subsetting protection is very far from obvious. Like time series
>>>> data (not time stamped observations, but real ordered time series),
>>>> spatial data have implicit order that is tidy in a different way than tidy
>>>> (graph dependencies between observation rows). Maybe mapped features with
>>>> attached attribute data are "tidier" than a table, because they would
>>>> preserve their relative positions?
>>>
>>> This seems like the crux of the problem. Are you saying that the
>>> orthogonal "tidiness" of the tidyverse is fundamentally different from
>>> the underlying relationships of spatial data, and therefore there's a
>>> limit to the usefulness of the tidyverse approach to spatial analysis?
>>
>> Right, this is central and open. Spatial objects can be tidy in the
>> tidyverse sense - sf shows this can be done. The open question is whether
>> this extends to the analysis of the data.
> 
> 
> 
> I think the structures are still wide open, unless you think the goals of
> sf encompass everything. The simple features standard is one corner of the
> available ways to structure spatial data. It pointedly ignores relations
> between objects, and relies heavily on 2D-specific optimizations. All the
> topology is done "on the fly" and is treated as expendable. I personally
> want structures that treat simple features and similar formats as
> expendable, but that maintain the topology as a first-class citizen. I
> don't say this because I think simple features is unimportant; it's just
> not my main focus. I want time, depth, temperature, salinity, anything and
> everything storable on features, parts, edges, vertices and nodes. Lots of
> tools do this, or aspects of it, but none of them can be shoe-horned into
> simple features, and they don't share a unifying framework. (In other
> contexts this is called "mesh structures", the simplicial complex, or
> simply "a database", but none of those terms is really enough.)
> 
>  (I'm dismayed when GIS ideas are spoken about as if they were the end of
> the story. sf is really great, but it's now more than the standard, and
> it's kind of unique among the available tools for the standard itself -
> most of them are pretty specialized and different from each other,
> probably unavoidably.)
> 
> ggplot2 and tidygraph still have a long way to go, but in them I believe
> there is a language not just of graphics, or of converting to and from
> igraph, but of flexible ways of specifying how spatial data are
> constructed, analysed and interacted with. ggplot2 builds geometries, but
> it has no output other than a plot and the object that can modify that
> plot. Select, group-by, arrange, filter, nest, and normalization into
> multiple tables are all part of the powerful ways in which raw data can
> be given structure. "Spatial" is just one small part of that structure,
> and unfortunately the "geographic spatial" - the 2D optimizations baked
> in by decades-old GIS practice - seems to have the most sway in
> discussions.
> 
> This community could find a way to bake in the geometries from ggplot
> constructions as sf objects, i.e. convert a ggplot into sf, not into a
> plot. I think that's a digestible project that would really provide some
> valuable insights, as sketched below.
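>
> A toy sketch of the idea (ggplot_build() is real; turning its computed
> data into sf geometries is the part to be worked out):
>
> library(ggplot2)
> p <- ggplot(data.frame(x = c(1, 2, 3), y = c(1, 3, 2)), aes(x, y)) +
>   geom_line()
> built <- ggplot_build(p)   # the computed layer data, not a drawn plot
> xy <- as.matrix(built$data[[1]][, c("x", "y")])
> sf::st_linestring(xy)      # the built line as a simple feature geometry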
> 
> 
>> From looking around, it seems
>> that time series and graphs suffer from the same kinds of questions, and
>> it will be interesting to see what stars turns up for spatio-temporal
>> data:
>>
>> https://github.com/edzer/stars
>>
>> I see that stars wisely does not include trip/trajectory data structures,
>> but analysis might involve reading the array-based representation of
>> spatio-temporal environmental drivers for an ensemble of buffers around
>> trajectories (say animal movement), then modelling.
>>
>>
> Interesting that you mention this. I agree it's a crux area in modern
> spatial analysis: tracking has been neglected for years, but it's starting
> to be looked at.
> 
>  I don't see this as terribly difficult; in fact we do this routinely for
> all kinds of tracking data, both as-measured and modelled. The difficulty
> is having access to the data as well as the motivation in the same place.
> We have in-house tools that sit on large collections of remotely sensed
> time series data:
> 
> https://github.com/AustralianAntarcticDataCentre/raadsync
> 
> When you have both the motivation (lots of animal tracking and voyage
> data) and the data on hand, it's relatively easy (we have a ready audience
> in Southern Ocean ecosystems research here). The first step is to write a
> read-data tool as a function of time. Then the problem of matching
> tracking data to the right space-time window breaks down naturally into
> the intervals between the time steps of the environmental data. At that
> level you can employ interpolation between the layers, aggregation to
> trade resolution against coverage, and even bring user-defined
> calculations to the smallish windows to compute derived products
> (rugosity, slope, distance to thresholds, etc.), as in the sketch below.
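>
> In miniature, that matching step is just interval lookup (made-up times,
> base R only):
>
> env_times <- as.POSIXct("2017-01-01", tz = "UTC") + 86400 * 0:9
> obs_times <- as.POSIXct("2017-01-03 06:00", tz = "UTC") + 3600 * c(0, 12, 30)
> i <- findInterval(as.numeric(obs_times), as.numeric(env_times))
> # weight for linear interpolation between layers i and i + 1
> w <- as.numeric(difftime(obs_times, env_times[i], units = "secs")) /
>      as.numeric(difftime(env_times[i + 1], env_times[i], units = "secs"))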
> 
> In our modelled track outputs with tripEstimation/SGAT and bsam [1] we
> can use the probabilistic estimates of location as more nuanced "query
> buckets" against the same environmental data; it's the same issue of
> matching time, with all the different options you might want. What is
> hard for us is sharing the tools, because first you need to arrange
> access to quite a lot of data before you get to see the magic in action.
> 
> I do think that stars provides a great opportunity though, and my
> thoughts along those lines are here:
> 
> https://github.com/mdsumner/stars.use.cases/blob/master/Use-cases.rmd#animal-tracking-mcmc-estimation
> 
> 
> [1] https://CRAN.R-project.org/package=bsam
> 
> 
> 
>> Maybe: "everything should be as tidy as it can be but not tidier"?
>>
>>
> I don't think we've even started with the housekeeping here. Some early
> thoughts are here:
> 
> http://rpubs.com/cyclemumner/sc-rationale
> 
> I'm interested to help find ways to nest the spdep idioms in "tidy"
> ways, but I'm not yet sure how valuable that is. When you convert the
> structures to primitives, the edge and vertex relations are already
> inherent, and it's a matter of standard database join/select/filter
> idioms to link things up:
> 
> http://rpubs.com/cyclemumner/neighbours01
> 
> http://rpubs.com/cyclemumner/neighbours02
> 
> (Those examples use https://github.com/mdsumner/scsf for the PRIMITIVE
> model, but it's really too raw to use yet.) This is all fine and dandy,
> but then you still have a list-bag - i.e. a database - of tables, with no
> strong idioms for treating them as a single object. That's the real,
> loftier goal of tidygraph, and it's certainly a clear goal of the
> tidyverse to be a general master of data.
> 
> Is this a serious option worth pursuing for spdep? I don't know, but I'm
> pursuing it for my own reasons, as detailed above. I don't think
> modernizing spdep would be that difficult. At the least, the limitations
> and obstacles faced when using it should be described.
> 
> I'm happy to help and I might have a go at it at some point, and I'd
> encourage anyone to get in and try it out. With rlang and the imminent
> release of the greatly improved dplyr, it's a good time to start.
> 
> Cheers, Mike.
> 
> 
> 
>> Roger
>>
>>>
>>> On Mon, May 15, 2017 at 5:17 AM Roger Bivand <Roger.Bivand at nhh.no> wrote:
>>>
>>>> On Sun, 14 May 2017, Tiernan Martin wrote:
>>>>
>>>>> You’re right about one thing: those of us introduced to the pipe in our
>>>>> formative years find it hard to escape (everything is better here
>>>>> inside the pipe - right?) ;)
>>>>
>>>> Pipes can be great if they don't involve extra inputs at a later stage
>>>> in the pipe, but as:
>>>>
>>>> https://github.com/thomasp85/tidygraph
>>>>
>>>> suggests, it isn't very easy, and may be forced.
>>>>
>>>>>
>>>>> Thank you for your very informative response. Before reading it, I had
>>>>> a vague inkling that matrices were part of the challenge of adapting
>>>>> these tools to work with simple features, but I also thought perhaps
>>>>> all this issue needed was a little nudge from a user like me. Your
>>>>> response made it clear that there is a series of technical decisions
>>>>> that need to be made, and in order to do that I imagine some
>>>>> exploratory testing is in order. Of course, it is possible that the
>>>>> data.frame + list-col foundation is just not well-suited to the needs
>>>>> of spatial weighting/testing tools, but I suspect that there is some
>>>>> valuable insight to be gained from exploring this possibility further.
>>>>> And as I stated in my first post, going forward I expect more users
>>>>> will be interested in seeing this sort of integration happen.
>>>>
>>>> I think that there are two tasks - one to create neighbour objects from
>>>> sf objects, and another to see whether the costs of putting the spatial
>>>> weights into a data.frame (or, even more challenging, a tibble) are
>>>> worth it for testing spatial autocorrelation or modelling spatial
>>>> dependence. Should the weights be squirreled away inside the object
>>>> flowing down the pipe?
>>>>
>>>> sf_object %>% add_weights(...) -> sf_object_with_weights
>>>> moran.test(sf_object_with_weights, ...)
>>>> geary.test(sf_object_with_weights, ...)
>>>>
>>>> or
>>>>
>>>> sf_object %>% add_weights(...) %>% moran.test(...) -> output
>>>>
>>>> all in one (output is an htest object). It gets more complex in:
>>>>
>>>> lm.morantest(lm(formula, sf_object), sf_object_with_weights)
>>>>
>>>> as we need the weights and the lm output object.
>>>>
>>>>>
>>>>>> How should we go about this technically? spdep is on R-Forge and is
>>>>>> happy there. Its present functionality has to be maintained as it
>>>>>> stands; it has too many reverse dependencies to break. Should it be
>>>>>> re-written from scratch (with more sparse matrices internally for
>>>>>> computation)?
>>>>>> ...
>>>>>> We'd need proof of concept with realistically sized data sets (not
>>>>>> yet NY taxis, but maybe later ...). spdep started as spweights,
>>>>>> sptests and spdep, and the first two got folded into the third when
>>>>>> stable. If weights are the first thing to go for, sfweights is where
>>>>>> to go first (and port the STRtree envelope intersections for
>>>>>> contiguities). It could define the new classes, and tests and
>>>>>> modelling would use them.
>>>>>
>>>>> Sounds to me like this package would need to be built from the ground
>>>>> up, possibly following a similar path to the `sp` development process
>>>>> as you mentioned. Maybe that presents a challenge with regard to
>>>>> resources (i.e., funding, time, etc.). Perhaps this project is a
>>>>> candidate for future proposals to the R Consortium or other funding
>>>>> sources. I hope that this post is useful in documenting user interest
>>>>> in seeing the `sf` protocol integrated into other spatial tools and
>>>>> workflows, and I look forward to hearing others' thoughts on the
>>>>> matter.
>>>>
>>>> Making spatial weights from sf objects is just expanding spdep, and is
>>>> feasible. There is no point looking for funding; all it needs is
>>>> knowledge of how things have been done in spdep. Contiguity, knn,
>>>> distance, and graph-based neighbours are all feasible.
>>>>
>>>> Extending this to adding weights to sf objects may be going too far. It
>>>> seems slightly forced to - say - attach a sparse matrix to an sf geometry
>>>> column as an attribute, or to add a neighbour nested list column when
>>>> writing subsetting protection is very far from obvious. Like time series
>>>> data (not time stamped observations, but real ordered time series),
>>>> spatial data have implicit order that is tidy in a different way than tidy
>>>> (graph dependencies between observation rows). Maybe mapped features with
>>>> attached attribute data are "tidier" than a table, because they would
>>>> preserve their relative positions?
>>>>
>>>> Roger
>>>>
>>>>>
>>>>> Thanks again,
>>>>>
>>>>> Tiernan
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Sat, May 13, 2017 at 1:39 PM Roger Bivand <Roger.Bivand at nhh.no> wrote:
>>>>>
>>>>>> On Fri, 12 May 2017, Tiernan Martin wrote:
>>>>>>
>>>>>>> Is anyone thinking about creating an adaptation of the `spdep` package
>>>>>>> that expects sf-class inputs and works well in a pipeline?
>>>>>>
>>>>>> I assume that "this is not a pipeline" is a joke for insiders (in the
>>>>>> pipe)?
>>>>>>
>>>>>> No, there is your issue on the sfr github repository that is relevant
>>>>>> for contiguous neighbours, but not beyond that:
>>>>>>
>>>>>> https://github.com/edzer/sfr/issues/234
>>>>>>
>>>>>> An sf is a data.frame, and as such should "just work", like "Spatial"
>>>>>> objects have, in formula/data settings. The problem is (of course) the
>>>>>> weights matrix.
>>>>>>
>>>>>> Should it be a list column (each row has a list nesting two lists,
>>>>>> first indices, second non-zero weights), or remain separate as it has
>>>>>> been for 20 years, or become a column-oriented representation (Matrix
>>>>>> package)? A nested list, like a list column or a listw object, is
>>>>>> row-oriented. I had started thinking about using a sparse
>>>>>> column-oriented representation, but have not implemented functions
>>>>>> accepting them instead of listw objects.
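>>>>>>
>>>>>> (To make the row orientation concrete, a quick look at an existing
>>>>>> listw object, using the columbus data shipped with spdep:)
>>>>>>
>>>>>> library(spdep)
>>>>>> data(columbus, package = "spdep")  # provides col.gal.nb, an nb object
>>>>>> lw <- nb2listw(col.gal.nb)
>>>>>> str(lw$neighbours[[1]])  # integer ids of the neighbours of region 1
>>>>>> str(lw$weights[[1]])     # the matching non-zero weights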
>>>>>>
>>>>>> I am very cautious about creating classes for data that were
>>>>>> data.frame, then sf, and then have the weights built in. In the simple
>>>>>> case it would work, but all you have to do is re-order the rows and
>>>>>> the link between the neighbour ids and row order breaks down; the same
>>>>>> applies to subsetting.
>>>>>>
>>>>>> The problems to solve first are related to the workflows, and easiest
>>>>>> to look at in the univariate case (Moran, Geary, join-count, Mantel,
>>>>>> G, ...) for global and local tests. I think that all or almost all of
>>>>>> the NSE verbs (mutate, select, group) will cause chaos once weights
>>>>>> have been created. If there is a way to bind the neighbour IDs to the
>>>>>> input sf object rows, a list column might be possible, but we have to
>>>>>> permit multiple such columns (Queen, Rook, ...), and ensure that
>>>>>> subsetting and row-reordering keep the graph relations intact (and
>>>>>> their modifications, like row-standardisation).
>>>>>>
>>>>>> If you've seen any examples of how tidy relates to graphs (adjacency
>>>>>> lists are like nb objects), we could learn from that.
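>>>>>>
>>>>>> (One such mapping: an nb object flattens into a two-column edge table,
>>>>>> the usual tidy-graph form - a sketch with the columbus data, ignoring
>>>>>> the 0 convention for no-neighbour regions:)
>>>>>>
>>>>>> library(spdep)
>>>>>> data(columbus, package = "spdep")  # provides col.gal.nb
>>>>>> edges <- data.frame(from = rep(seq_along(col.gal.nb),
>>>>>>                                lengths(col.gal.nb)),
>>>>>>                     to = unlist(col.gal.nb))
>>>>>> head(edges)  # one row per graph dependency between observations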
>>>>>>
>>>>>> How should we go about this technically? spdep is on R-Forge and is
>>>>>> happy there. Its present functionality has to be maintained as it
>>>>>> stands; it has too many reverse dependencies to break. Should it be
>>>>>> re-written from scratch (with more sparse matrices internally for
>>>>>> computation)?
>>>>>>
>>>>>> Your example creates weights on the fly for a local G* map, but has
>>>>>> n=100, not, say, n=90000 (LA census blocks). Using the sf_* GEOS-based
>>>>>> methods does not use STRtrees, which cut the time needed to find
>>>>>> contiguous neighbours from days to seconds. We ought to pre-compute
>>>>>> weights, but this messes up the flow of data, because a second lot of
>>>>>> data (the weights) has to enter the pipe and be matched with the
>>>>>> row-ids.
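>>>>>>
>>>>>> (A sketch of that pre-compute-and-match pattern, with region ids as
>>>>>> the link between the two lots of data:)
>>>>>>
>>>>>> library(sf); library(spdep)
>>>>>> nc <- st_read(system.file("shape/nc.shp", package = "sf"))
>>>>>> nb <- poly2nb(as(nc, "Spatial"), row.names = as.character(nc$NAME))
>>>>>> lw <- nb2listw(nb)  # computed once, outside the pipe
>>>>>> # when the weights re-enter the pipe, region ids let us check the match
>>>>>> stopifnot(identical(attr(nb, "region.id"), as.character(nc$NAME)))
>>>>>> moran.test(nc$NWBIR74, lw)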
>>>>>>
>>>>>> We'd need proof of concept with realistically sized data sets (not
>>>>>> yet NY taxis, but maybe later ...). spdep started as spweights,
>>>>>> sptests and spdep, and the first two got folded into the third when
>>>>>> stable. If weights are the first thing to go for, sfweights is where
>>>>>> to go first (and port the STRtree envelope intersections for
>>>>>> contiguities). It could define the new classes, and tests and
>>>>>> modelling would use them.
>>>>>>
>>>>>> Thanks for starting this discussion.
>>>>>>
>>>>>> Roger
>>>>>>
>>>>>>>
>>>>>>> I understand that there is skepticism about the wisdom of adopting
>>>>>>> the “tidyverse” principles throughout the R package ecosystem, and I
>>>>>>> share the concern that an over-reliance on any single paradigm could
>>>>>>> reduce the resilience and diversity of the system as a whole.
>>>>>>>
>>>>>>> That said, I believe that the enthusiastic adoption of the `sf`
>>>>>>> package and its connections with widely-used tidyverse packages like
>>>>>>> `dplyr` and `ggplot2` may result in increased demand for sf-friendly
>>>>>>> spatial analysis tools. As an amateur who recently started using R
>>>>>>> as my primary GIS tool, I find that the tidyverse's preference for
>>>>>>> data frames, S3 objects, list columns, and pipeline workflows seems
>>>>>>> well-suited to the field of spatial analysis. Are there some
>>>>>>> fundamental reasons why the `spdep` tools cannot (or should not) be
>>>>>>> adapted to the tidyverse "dialect"?
>>>>>>>
>>>>>>> Let me put the question in the context of an actual analysis: in
>>>>>>> February 2017, the pop culture infovis website The Pudding
>>>>>>> (https://pudding.cool/) published an analysis of regional preferences
>>>>>>> for Oscar-nominated films in the US
>>>>>>> (https://pudding.cool/2017/02/oscars_so_mapped/). A few days ago, the
>>>>>>> author posted a tutorial explaining the method of “regional
>>>>>>> smoothing” used to create the article’s choropleths
>>>>>>> (https://pudding.cool/process/regional_smoothing/).
>>>>>>>
>>>>>>> The method relies on several `spdep` functions
>>>>>>> (https://github.com/polygraph-cool/smoothing_tutorial/blob/master/smoothing_tutorial.R).
>>>>>>> In the code below, I provide a reprex with a smaller dataset included
>>>>>>> in the `sf` package:
>>>>>>>
>>>>>>> library(sf)
>>>>>>> library(spdep)
>>>>>>>
>>>>>>> nc <- st_read(system.file("shape/nc.shp", package = "sf"))  # North Carolina counties
>>>>>>> nc_shp <- as(nc, 'Spatial')
>>>>>>>
>>>>>>> coords <- coordinates(nc_shp)
>>>>>>> IDs <- row.names(as(nc_shp, "data.frame"))
>>>>>>>
>>>>>>> knn5 <- knn2nb(knearneigh(coords, k = 5), row.names = IDs)  # find the nearest neighbors for each county
>>>>>>> knn5 <- include.self(knn5)
>>>>>>>
>>>>>>> localGvalues <- localG(x = as.numeric(nc_shp@data$NWBIR74), listw =
>>>>>>>   nb2listw(knn5, style = "B"), zero.policy = TRUE)  # calculate the G scores
>>>>>>> localGvalues <- round(localGvalues, 3)
>>>>>>>
>>>>>>> nc_shp@data$LOCAL_G <- as.numeric(localGvalues)
>>>>>>>
>>>>>>> p1 <- spplot(nc_shp, c('NWBIR74'))
>>>>>>> p2 <- spplot(nc_shp, c('LOCAL_G'))
>>>>>>> plot(p1, split=c(1,1,2,2), more=TRUE)
>>>>>>> plot(p2, split=c(1,2,2,2), more=TRUE)
>>>>>>>
>>>>>>> Here’s what I imagine that would look like in a tidyverse pipeline
>>>>>>> (please note that this code is for illustrative purposes and will
>>>>>>> not run):
>>>>>>>
>>>>>>> library(tidyverse)
>>>>>>> library(purrr)
>>>>>>> library(sf)
>>>>>>> library(sfdep) # this package doesn't exist (yet)
>>>>>>>
>>>>>>> nc <- st_read(system.file("shape/nc.shp", package = "sf"))
>>>>>>>
>>>>>>> nc_g <-
>>>>>>>   nc %>%
>>>>>>>   mutate(
>>>>>>>     KNN = map(.x = geometry, ~ sfdep::st_knn(.x, k = 5, include.self = TRUE)),  # find the nearest neighbors for each county
>>>>>>>     NB_LIST = map(.x = KNN, ~ sfdep::st_nb_list(.x, style = 'B')),  # make a list of the neighbors using the binary method
>>>>>>>     LOCAL_G = sfdep::st_localG(x = NWBIR74, listw = NB_LIST, zero.policy = TRUE),  # calculate the G scores
>>>>>>>     LOCAL_G = round(LOCAL_G, 3))
>>>>>>>
>>>>>>> We can see that the (hypothetical) tidyverse version reduces the
>>>>>>> number of intermediate objects and wraps the creation of the G scores
>>>>>>> into a single code chunk with clear steps.
>>>>>>>
>>>>>>> I'd be grateful to hear from the users and developers of the `spdep`
>>>>>>> and `sf` packages about this topic!
>>>>>>>
>>>>>>> Tiernan Martin
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Roger Bivand
>>>>>> Department of Economics, Norwegian School of Economics,
>>>>>> Helleveien 30, N-5045 Bergen, Norway.
>>>>>> voice: +47 55 95 93 55; e-mail:
>>>>>> Roger.Bivand at nhh.no
>>>>>> Editor-in-Chief of The R Journal,
>>>> https://journal.r-project.org/index.html
>>>>>> http://orcid.org/0000-0003-2392-6140
>>>>>> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
>>>>>
>>>>
>>>> --
>>>> Roger Bivand
>>>> Department of Economics, Norwegian School of Economics,
>>>> Helleveien 30, N-5045 Bergen, Norway.
>>>> voice: +47 55 95 93 55; e-mail:
>>>> Roger.Bivand at nhh.no
>>>> Editor-in-Chief of The R Journal,
>> https://journal.r-project.org/index.html
>>>> http://orcid.org/0000-0003-2392-6140
>>>> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
>>>
>>
>> --
>> Roger Bivand
>> Department of Economics, Norwegian School of Economics,
>> Helleveien 30, N-5045 Bergen, Norway.
>> voice: +47 55 95 93 55; e-mail:
>> Roger.Bivand at nhh.no
>> Editor-in-Chief of The R Journal, https://journal.r-project.org/index.html
>> http://orcid.org/0000-0003-2392-6140
>> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> 

-- 
Edzer Pebesma
Institute for Geoinformatics  (ifgi),  University of Münster
Heisenbergstraße 2, 48149 Münster, Germany; +49 251 83 33081
Journal of Statistical Software:   http://www.jstatsoft.org/
Computers & Geosciences:   http://elsevier.com/locate/cageo/
