[R-sig-Geo] spdep: new zero.policy attribute

Roger Bivand Roger.Bivand using nhh.no
Thu Nov 9 12:52:24 CET 2023


Thanks, Josiah and Connor!

We are moving the discussion to https://github.com/gpiras/sphet/issues/17 for those who can participate there. If anyone with comments prefers not to write there, please continue to follow up here.

Roger

--
Roger Bivand
Emeritus Professor
Norwegian School of Economics
Postboks 3490 Ytre Sandviken, 5045 Bergen, Norway
Roger.Bivand using nhh.no

________________________________________
From: Josiah Parry <josiah.parry using gmail.com>
Sent: 05 November 2023 18:01
To: Roger Bivand
Cc: r-sig-geo using r-project.org
Subject: Re: [R-sig-Geo] spdep: new zero.policy attribute

My take is generally that less is more. In the case of an isolated node, I think the best thing to do is to return NA rather than 0. For example consider administrative boundaries where the lagged variable is median household income. If we returned a 0 in that case, we'd likely be introducing quite the outlier!

My preference would be to _always_ default to NA lagged values when no neighbors are present. Additionally, it would be quite nice to instruct users on how to impute these values with other lagged variables. Say we have an isolated node, but we don't want an NA value. We can impute that missing value with the spatial lag of the variable using a different neighborhood construction—e.g. using KNN with k = 3 to ensure that the node always has values to lag.

Here is an example gist imputing with a k = 3 lag: https://gist.github.com/JosiahParry/eb7878fc375fb931ddd6675a2c591a2b
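The imputation idea can be sketched in base R without any packages. This is not the code from the gist: the coordinates, values and neighbour list are made up for illustration, and the nearest-neighbour search is a plain distance-matrix lookup. In spdep itself the k-nearest-neighbour object would more likely come from knearneigh() and knn2nb().

```r
# Hypothetical point coordinates and values; observation 4 is isolated under
# the contiguity-style neighbour list below (0L marks "no neighbours").
coords <- cbind(x = c(0, 1, 2, 10, 1), y = c(0, 0, 0, 10, 1))
income <- c(52000, 48000, 50000, 61000, 47000)
nb <- list(2L, c(1L, 3L, 5L), 2L, 0L, 2L)

# Row-standardised lag: mean of the neighbours' values, NA where there are none.
lag_na <- vapply(nb, function(i) {
  if (identical(i, 0L)) NA_real_ else mean(income[i])
}, numeric(1))

# k nearest neighbours by Euclidean distance, excluding the point itself.
k <- 3
d <- as.matrix(dist(coords))
diag(d) <- Inf
knn <- lapply(seq_len(nrow(d)), function(i) order(d[i, ])[seq_len(k)])
lag_knn <- vapply(knn, function(i) mean(income[i]), numeric(1))

# Impute only the missing lags with the k = 3 lag; all others are kept.
lag_imputed <- ifelse(is.na(lag_na), lag_knn, lag_na)
```

Only the isolated observation changes: its lag becomes the mean of its three nearest observations, while every observation that already had neighbours keeps its original lag.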

Another option could be to use the focal feature's value as the lag itself. If a location has no neighborhood could we argue that the location is its own neighborhood?
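A minimal base R sketch of that option, with made-up values and a hand-written neighbour list (0L marks an observation with no neighbours); it is an illustration of the idea, not anything spdep does:

```r
income <- c(52000, 48000, 50000, 61000)
nb <- list(2L, c(1L, 3L), 2L, 0L)  # observation 4 has no neighbours

# Lag is the mean over neighbours; an isolated observation falls back to its
# own value, i.e. it is treated as its own neighbourhood.
lag_self <- vapply(seq_along(nb), function(i) {
  nbrs <- nb[[i]]
  if (identical(nbrs, 0L)) income[i] else mean(income[nbrs])
}, numeric(1))
```

One consequence of this choice is that the difference between a value and its lag is exactly zero for every isolated observation.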

I'm not too familiar with the models to comment on them, though. I do see this more as a pre-processing / data imputation issue. I suspect that's more of a machine learning paradigm though!


On Sun, Nov 5, 2023 at 10:31 AM Roger Bivand <Roger.Bivand using nhh.no> wrote:
And a question: in nb2listw() and similar functions creating spatial weights listw objects, would it be sensible to guess that the presence of no-neighbour observations in the input nb neighbour object implies the choice of a spatially lagged value of zero (zero.policy=TRUE), lx = Wx, rather than NA (zero.policy=FALSE)?

That is, use by default zero.policy=any(card(nb) == 0L) rather than zero.policy=NULL, which looks up the spdep option that is set to FALSE on package load but can be changed by the user?
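To make the two outcomes concrete, here is a base R sketch under stated assumptions: card_sketch() and lag_sketch() are hypothetical stand-ins written for this illustration (they are not spdep's card() or lag.listw()), the neighbour list is hand-built with 0L marking "no neighbours", and the weights are row-standardised.

```r
# Neighbour list: one integer vector per observation; observation 4 is isolated.
nb <- list(2L, c(1L, 3L), 2L, 0L)

# Stand-in for a neighbour count per observation.
card_sketch <- function(nb) {
  vapply(nb, function(i) if (identical(i, 0L)) 0L else length(i), integer(1))
}

# Row-standardised lag; isolates get 0 or NA depending on zero_policy.
lag_sketch <- function(nb, x, zero_policy) {
  vapply(nb, function(i) {
    if (identical(i, 0L)) {
      if (zero_policy) 0 else NA_real_
    } else {
      mean(x[i])
    }
  }, numeric(1))
}

income <- c(52000, 48000, 50000, 61000)

# The default floated above: TRUE only if some observation has no neighbours.
zero_policy_guess <- any(card_sketch(nb) == 0L)

lag_zero <- lag_sketch(nb, income, zero_policy = zero_policy_guess)
lag_na <- lag_sketch(nb, income, zero_policy = FALSE)
```

With these values the guessed default is TRUE, so the isolated observation receives a lagged income of 0 next to neighbours' lags of around 50000, whereas the FALSE policy leaves it as NA.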

Would this be taking helpfulness too far, given that the analyst is creating the neighbour object and presumably should take responsibility for choices made?

Context: polygons not sharing boundaries with other polygons do exist legitimately in data sources, but setting spatially lagged values to zero for those polygons is quite an invasive imputation. It may be better to oblige the user to make the choice when the spatial weights listw object is created.

Little is known about the problem; for a recent treatment for CAR models see: https://arxiv.org/abs/1705.04854, published as https://doi.org/10.1016/j.sste.2018.04.002, where: "The specification of a CAR model on a disconnected graph is undefined ... [t]here are essentially two types of disconnected graphs: first, a graph containing an island (a singleton node with no neighbours), second, a graph split in different sub-graphs (each of them being a connected graph)".

This question concerns the former, singleton, case, but adding sub-graph counts (if greater than unity) to summary.nb and print.nb addresses the second. Very possibly, functions creating nb neighbour objects should themselves report that an output object (graph) is not connected; bigDM, CARBayes, CARBayesST, geostan, spatialreg and stampr do call spdep::n.comp.nb themselves to check the subgraph count.
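For readers unfamiliar with the subgraph count, the following base R sketch shows the kind of check involved. count_subgraphs() is a hypothetical helper written for this illustration, not the spdep implementation; it assumes a symmetric neighbour list with 0L marking "no neighbours", and its return names nc and comp.id are chosen to resemble what n.comp.nb reports.

```r
# Count connected subgraphs of a neighbour list by depth-first search.
count_subgraphs <- function(nb) {
  n <- length(nb)
  comp <- integer(n)  # 0 = not yet visited
  current <- 0L
  for (start in seq_len(n)) {
    if (comp[start] != 0L) next
    current <- current + 1L
    stack <- start
    while (length(stack) > 0L) {
      node <- stack[length(stack)]
      stack <- stack[-length(stack)]
      if (comp[node] != 0L) next
      comp[node] <- current
      nbrs <- nb[[node]]
      nbrs <- nbrs[nbrs > 0L]  # drop the 0L "no neighbours" marker
      stack <- c(stack, nbrs[comp[nbrs] == 0L])
    }
  }
  list(nc = current, comp.id = comp)
}

# Observations 1-3 form one subgraph, 4 is a singleton, 5-6 form another.
nb <- list(2L, c(1L, 3L), 2L, 0L, 6L, 5L)
res <- count_subgraphs(nb)
```

Here the singleton counts as a subgraph of its own, so both kinds of disconnection described in the quotation show up in the same count.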

Interested in feedback,

Roger

--
Roger Bivand
Emeritus Professor
Norwegian School of Economics
Postboks 3490 Ytre Sandviken, 5045 Bergen, Norway
Roger.Bivand using nhh.no

________________________________________
From: R-sig-Geo <r-sig-geo-bounces using r-project.org> on behalf of Roger Bivand <Roger.Bivand using nhh.no>
Sent: 04 November 2023 18:53
To: r-sig-geo using r-project.org
Subject: [R-sig-Geo] spdep: new zero.policy attribute

In forthcoming spdep 1.3-1, spatial weight listw objects get a new zero.policy attribute. The attribute is added as objects are created to record the status of the zero.policy argument in the function creating the object, see: https://github.com/r-spatial/spdep/commit/e159de922c61713529a4075b0dfc2966eb8f9ad6.

Reverse dependency checks only show problems from over-eager unit testing in SpatialFeatureExperiment, a Bioconductor package, but other workflows may be impacted. The new attribute is used in tests for spatial autocorrelation to set the zero.policy argument in those tests (the arguments were zero.policy=NULL, are now zero.policy=attr(listw, "zero.policy"), where listw is the spatial weights object argument to the test function).
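The pattern can be sketched in base R as follows. Both functions are hypothetical and exist only for this illustration; they mimic the idea of recording the choice on the object and reading it back as a default, and do not reproduce spdep's code.

```r
# Hypothetical constructor: stores the zero_policy choice on the object.
make_weights <- function(nb, zero_policy = FALSE) {
  isolated <- vapply(nb, function(i) identical(i, 0L), logical(1))
  if (any(isolated) && !zero_policy) {
    stop("empty neighbour sets found; set zero_policy = TRUE to allow them")
  }
  w <- list(neighbours = nb)
  attr(w, "zero.policy") <- zero_policy
  w
}

# Hypothetical downstream function: its default is the recorded attribute,
# so the choice made at creation time carries over unless overridden.
lag_from_weights <- function(w, x, zero_policy = attr(w, "zero.policy")) {
  vapply(w$neighbours, function(i) {
    if (identical(i, 0L)) {
      if (isTRUE(zero_policy)) 0 else NA_real_
    } else {
      mean(x[i])
    }
  }, numeric(1))
}

w <- make_weights(list(2L, c(1L, 3L), 2L, 0L), zero_policy = TRUE)
lag_default <- lag_from_weights(w, c(1, 2, 3, 4))
lag_override <- lag_from_weights(w, c(1, 2, 3, 4), zero_policy = FALSE)
```

The user states the policy once, when the weights are built, and later calls inherit it; passing the argument explicitly still overrides the stored value.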

This will be extended to spatialreg and friends if nobody reports negative impacts here soon. I'll wait before releasing 1.3-1 for a few days to see if any feedback is forthcoming.

Hope this long-overdue change is helpful,

Roger

--
Roger Bivand
Emeritus Professor
Norwegian School of Economics
Postboks 3490 Ytre Sandviken, 5045 Bergen, Norway
Roger.Bivand using nhh.no
_______________________________________________
R-sig-Geo mailing list
R-sig-Geo using r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
