[R-sig-Geo] spdep: new zero.policy attribute

Josiah Parry
Sun Nov 5 18:01:38 CET 2023


My take is generally that less is more. In the case of an isolated node, I
think the best thing to do is to return NA rather than 0. For example,
consider administrative boundaries where the lagged variable is median
household income. If we returned 0 in that case, we would likely be
introducing quite an outlier!
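
To make the outlier concern concrete, here is a minimal base R sketch. It does
not call spdep; the income values, the neighbour list and the helper lag_mean()
are all made up for illustration, and the lag is simply the mean of the
neighbours' values.

```r
# Toy data (hypothetical): four areas, area 4 is an island with no neighbours.
income <- c(52000, 48000, 61000, 55000)
neighbours <- list(c(2L, 3L), c(1L, 3L), c(1L, 2L), integer(0))

# Hypothetical helper: mean of the neighbours' values, or `empty` for an island.
lag_mean <- function(x, nb, empty) {
  vapply(
    nb,
    function(idx) if (length(idx) == 0L) empty else mean(x[idx]),
    numeric(1)
  )
}

lag_zero <- lag_mean(income, neighbours, empty = 0)         # island gets 0
lag_na   <- lag_mean(income, neighbours, empty = NA_real_)  # island gets NA

print(cbind(income, lag_zero, lag_na))
```

With the zero convention the island's lagged income is 0, far below every
other lagged value; with NA the missing neighbourhood stays visible instead.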

My preference would be to _always_ default to NA lagged values when no
neighbors are present. Additionally, it would be quite nice to show
users how to impute these values with other lagged variables. Say we
have an isolated node, but we don't want an NA value. We can impute that
missing value with the spatial lag of the variable computed from a different
neighborhood construction, e.g. KNN with k = 3, to *ensure* that the
node always has values to lag.

Here is an example gist that imputes the lag using k = 3 nearest neighbors:
https://gist.github.com/JosiahParry/eb7878fc375fb931ddd6675a2c591a2b
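
Separately from the gist, the KNN imputation idea can be sketched in base R
alone. This is only an illustration under assumptions: the coordinates, the
income values, the contiguity list and the helper knn_lag() are invented, plain
Euclidean distance is used, and nothing here calls spdep.

```r
# Toy data (hypothetical): five areas, area 4 is far away and has no
# contiguity neighbours.
coords <- cbind(x = c(0, 1, 0, 5, 1), y = c(0, 0, 1, 5, 1))
income <- c(52000, 48000, 61000, 55000, 50000)
neighbours <- list(c(2L, 3L), c(1L, 5L), c(1L, 5L), integer(0), c(2L, 3L))

# Contiguity lag: mean of the neighbours' values, NA when there are none.
contig_lag <- vapply(
  neighbours,
  function(idx) if (length(idx)) mean(income[idx]) else NA_real_,
  numeric(1)
)

# Hypothetical helper: mean income of the k nearest other areas to area i.
knn_lag <- function(i, k = 3L) {
  d <- sqrt(rowSums(sweep(coords, 2, coords[i, ])^2))
  d[i] <- Inf                       # an area is not its own neighbour
  nearest <- order(d)[seq_len(k)]
  mean(income[nearest])
}

# Fill only the missing lags with the k = 3 nearest-neighbour lag.
missing <- which(is.na(contig_lag))
imputed_lag <- contig_lag
imputed_lag[missing] <- vapply(missing, knn_lag, numeric(1))

print(cbind(income, contig_lag, imputed_lag))
```

Only the island's lag is replaced; every area that already has contiguity
neighbors keeps its original lag.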

Another option could be to use the focal feature's value *as* the lag
itself. If a location has no neighborhood, could we argue that the location
*is* its own neighborhood?
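
A minimal base R sketch of that self-neighborhood idea, again with made-up
data and no spdep calls: each island is listed as its own single neighbor, so
its lag equals its own value.

```r
# Toy data (hypothetical): four areas, area 4 is an island.
income <- c(52000, 48000, 61000, 55000)
neighbours <- list(c(2L, 3L), c(1L, 3L), c(1L, 2L), integer(0))

# Treat every island as its own neighbourhood.
is_island <- lengths(neighbours) == 0L
self_nb <- neighbours
self_nb[is_island] <- as.list(which(is_island))

# Lag as the mean of the (possibly self-only) neighbourhood.
self_lag <- vapply(self_nb, function(idx) mean(income[idx]), numeric(1))

print(cbind(income, self_lag))
```

This avoids both NA and 0, at the cost of making the island's lag perfectly
correlated with its own value, which is worth keeping in mind before using it
as a model input.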

I'm not too familiar with the models to comment on them, though. I do see
this more as a pre-processing / data imputation issue. I suspect that's
more of a machine learning paradigm though!


On Sun, Nov 5, 2023 at 10:31 AM Roger Bivand <Roger.Bivand using nhh.no> wrote:

> And a question: in nb2listw() and similar functions creating spatial
> weights listw objects, would it be sensible to guess that the presence of
> no-neighbour observations in the input nb neighbour object implies the
> choice of a spatially lagged value of zero (zero.policy=TRUE), lx = Wx,
> rather than NA (zero.policy=FALSE)?
>
> That is, use by default zero.policy=any(card(nb) == 0L) rather than
> zero.policy=NULL, which looks up the spdep option (set to FALSE by default
> on package load, but settable by the user)?
>
> Would this be taking the attempt to be helpful too far, given that the
> analyst is creating the neighbour object and presumably should take
> responsibility for choices made?
>
> Context: polygons not sharing boundaries with other polygons do exist
> legitimately in data sources, but setting spatially lagged values to zero
> for those polygons is quite an invasive imputation. It may be better to
> oblige the user to make the choice when the spatial weights listw object is
> created.
>
> Little is known about the problem; for a recent treatment for CAR models
> see: https://arxiv.org/abs/1705.04854, published as
> https://doi.org/10.1016/j.sste.2018.04.002, where: "The specification of
> a CAR model on a disconnected graph is undefined ... [t]here are
> essentially two types of disconnected graphs: first, a graph containing an
> island (a singleton node with no neighbours), second, a graph split in
> different sub-graphs (each of them being a connected graph)".
>
> This question concerns the former, singleton, case, but adding sub-graph
> counts (if greater than unity) to summary.nb and print.nb addresses the
> second. Very possibly, functions creating nb neighbour objects should
> themselves report that an output object (graph) is not connected; bigDM,
> CARBayes, CARBayesST, geostan, spatialreg and stampr do call
> spdep::n.comp.nb themselves to check the subgraph count.
>
> Interested in feedback,
>
> Roger
>
> --
> Roger Bivand
> Emeritus Professor
> Norwegian School of Economics
> Postboks 3490 Ytre Sandviken, 5045 Bergen, Norway
> Roger.Bivand using nhh.no
>
> ________________________________________
> From: R-sig-Geo <r-sig-geo-bounces using r-project.org> on behalf of Roger
> Bivand <Roger.Bivand using nhh.no>
> Sent: 04 November 2023 18:53
> To: r-sig-geo using r-project.org
> Subject: [R-sig-Geo] spdep: new zero.policy attribute
>
> In forthcoming spdep 1.3-1, spatial weight listw objects get a new
> zero.policy attribute. The attribute is added as objects are created to
> record the status of the zero.policy argument in the function creating the
> object, see:
> https://github.com/r-spatial/spdep/commit/e159de922c61713529a4075b0dfc2966eb8f9ad6
>
> Reverse dependency checks only show problems from over-eager unit testing
> in SpatialFeatureExperiment, a Bioconductor package, but other workflows
> may be impacted. The new attribute is used in tests for spatial
> autocorrelation to set the zero.policy argument in those tests (the
> arguments were zero.policy=NULL, and are now zero.policy=attr(listw,
> "zero.policy"), where listw is the spatial weights object argument to the
> test function).
>
> This will be extended to spatialreg and friends if nobody reports negative
> impacts here soon. I'll wait before releasing 1.3-1 for a few days to see
> if any feedback is forthcoming.
>
> Hope this long-overdue change is helpful,
>
> Roger
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo using r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo