[R-sig-Geo] How to efficiently generate data of neighboring points

Lom Navanyo <lomnavasia using gmail.com>
Fri Jun 5 10:12:06 CEST 2020


I fully agree with you and appreciate the listed benefits of not taking
things private. I was just trying to be sure the forum here is appropriate
for and receptive to a beginner like me.

To be more explicit with regard to my observations, y is the amount of water
withdrawal from wells, and an important variable in x is the (height of) water
level in the wells. These are end-of-year figures. I am using the
aggregations (sum for y and mean for water level) by band as spatial
neighborhood variables. There will also be one or two indicator variables
in x. I hope these do not present additional hurdles.
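
As a minimal sketch of the band aggregation described above (not the actual
data or workflow): the data frame wells, its column names, the toy values and
the half-open band rule (lower, upper] are all hypothetical assumptions, and
only base R is used so that it runs without spdep.

```r
# Hypothetical toy data: six wells with projected coordinates in metres,
# end-of-year withdrawal (the y variable) and water level (part of x).
wells <- data.frame(
  id         = 1:6,
  east       = c(0, 300, 1000, 1500, 3000, 9000),
  north      = c(0, 0, 0, 0, 0, 0),
  withdrawal = c(10, 20, 30, 40, 50, 60),
  level      = c(5, 6, 7, 8, 9, 10)
)

# Band limits in metres (0.25, 1 and 2 miles).
bds <- c(0, 402.336, 1609.34, 3218.69)

# Pairwise Euclidean distances; NA on the diagonal so that a well is
# never counted as its own neighbour.
d <- as.matrix(dist(wells[, c("east", "north")]))
diag(d) <- NA

for (b in seq_len(length(bds) - 1)) {
  # 0/1 indicator of neighbours whose distance lies in (lower, upper].
  in_band <- (!is.na(d) & d > bds[b] & d <= bds[b + 1]) * 1
  n_nb <- rowSums(in_band)
  # Sum of withdrawal and mean of level over the neighbours in this band;
  # wells with no neighbours in the band get NA.
  wells[[paste0("withdrawal_sum_b", b)]] <-
    ifelse(n_nb > 0, as.vector(in_band %*% wells$withdrawal), NA)
  wells[[paste0("level_mean_b", b)]] <-
    ifelse(n_nb > 0, as.vector(in_band %*% wells$level) / n_nb, NA)
}

print(wells)
```

Each band contributes two columns, so the result can be passed directly to a
regression function as one row per well.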

I think proximity is relevant in testing spatial
dependency/externality.

I will consider the splm package and the SLX model.
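
For orientation, an SLX model is ordinary least squares with spatially lagged
covariates (Wx) added to the right-hand side. The sketch below uses simulated
data and base R only; the coordinates, the 1-mile cut-off and the coefficient
values are assumptions chosen for illustration. The spatialreg package
provides lmSLX() for fitting the same kind of model from a listw object.

```r
set.seed(1)
n <- 100

# Hypothetical point locations on a 10 km x 10 km square (metres).
coords <- cbind(runif(n, 0, 10000), runif(n, 0, 10000))

# Binary first-order neighbours: other points within 1609.34 m (1 mile).
d <- as.matrix(dist(coords))
W <- (d > 0 & d <= 1609.34) * 1

# Row-standardise so that W %*% x is the mean of x over the neighbours;
# rows without neighbours stay all zero.
rs <- rowSums(W)
W[rs > 0, ] <- W[rs > 0, , drop = FALSE] / rs[rs > 0]

# Simulated covariate and response with a known direct effect (2)
# and a known spillover effect (1).
x  <- rnorm(n)
Wx <- as.vector(W %*% x)
y  <- 1 + 2 * x + 1 * Wx + rnorm(n, sd = 0.5)

# SLX: OLS with the spatially lagged covariate as an extra regressor.
fit <- lm(y ~ x + Wx)
print(coef(summary(fit)))
```

The coefficient on x is the direct effect and the coefficient on Wx is the
spillover from neighbours; no spatial lag of y enters the model.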

Thank you.
---------------
Lom

On Thu, Jun 4, 2020 at 2:52 PM Roger Bivand <Roger.Bivand using nhh.no> wrote:

> On Thu, 4 Jun 2020, Lom Navanyo wrote:
>
> > Thank you. Yes, the OLS is biased and my plan is to use a 2SLS approach. I
> > have a variable I intend to use as an IV for y.
> > I have seen a few papers use this approach. Will this approach not correct
> > for the endogeneity?
> >
> > Actually, I am not sure if this is the right forum, or perhaps if it's
> > appropriate or acceptable to you to take this one-on-one with you for help:
>
> I do not offer private help. That would presuppose that one person has the
> answer. It would also presuppose that all exchanges are only read by the
> original poster and direct participants, while in fact others may join in,
> or follow a thread, or find the thread by searching: google supports the
> list:r-sig-geo search tag. If the thread goes private, that search is
> fruitless.
>
> > My model actually looks like this: y = f(y, x) + e.
> > Aside from the endogeneity of y (which I intend to instrument by another
> > variable z), there is simultaneity between y and x.
> > I intend to use the lag of x as instrument for x. Given that I am seeking
> > to test spatial dependency, do you see some fatal flaws with my approach?
> >
>
> What is the support of your observations: are they points, or aggregations?
> Why might proximity make a difference? Often, apparent spatial
> autocorrelation is caused by observing inappropriate entities, or by
> omitting covariates, or by using the wrong functional form.
>
>
> > I have also seen other empirical approaches like static and dynamic spatial
> > panel data modelling. I will be reviewing them also to see suitability for
> > my objective.
> > But, any further directions or suggestions are highly appreciated.
>
> If the data are a spatial panel, you can look at the splm package.
> Personally, I have never found instruments any use at all, because the
> instruments are typically at best weak because of shared spatial processes
> with the response, unless the model is really well specified from known
> theory. In space, almost everything is close to endogenous unless the
> opposite is demonstrated. So causal relationships are less worthwhile,
> because they are at best conditional on omitted variables and
> autocorrelation engendered by the choice of observational entities.
>
> Further, because spatial processes are driven by the inverse matrix of the
> input graph of proximate neighbours (the covariance matrix of the spatial
> process), you don't need to start from more than the first order
> neighbours. Maybe your x has the same spatial pattern as y, so that the
> residuals are white noise with no spatial structure.
>
> Recently, analysts prefer to start with the SLX model (Halleck Vega &
> Elhorst 2015 and others), so that might be worth exploring. If only the
> direct impacts seem important, OLS may be enough.
>
> Hope this helps,
>
> Roger
>
> >
> > Thanks,
> > -------------------
> > Lom
> >
> >
> >
> > On Thu, Jun 4, 2020 at 3:48 AM Roger Bivand <Roger.Bivand using nhh.no> wrote:
> >
> >> On Thu, 4 Jun 2020, Lom Navanyo wrote:
> >>
> >>> Thank you very much for your support. This gives me what I need and I must
> >>> say listw2sn() is really great.
> >>>
> >>> Why do I need the data in the format as in data_out? I am trying to test
> >>> spatial dependence (or neighborhood effect) by running a regression
> >>> model that entails pop_size_it = beta_1*sum of pop_size of point i's
> >>> neighbors within a specified radius. So my plan is to get the neighbors
> >>> for each focal point as per the specified bands and their attributes (eg
> >>> pop_size) so I can add them (attribute) by the bands.
> >>
> >> Thanks, clarifies a good deal. Maybe look at the original localG articles
> >> for exploring distance relationships (Getis and Ord looked at HIV/AIDS);
> >> ?spdep::localG or
> >> https://r-spatial.github.io/spdep/reference/localG.html.
> >>
> >> Further note that OLS is biased as you have y = f(y) + e, so y on both
> >> sides. The nearest equivalent for a single band is spatialreg::lagsarlm()
> >> with listw=nb2listw(wd1, style="B") to get the neighbour sums through the
> >> weights matrix. So both your betas and their standard errors are unusable,
> >> I'm afraid. You are actually very much closer to ordinary kriging, looking
> >> at the way in which distance attenuates the correlation in value of
> >> proximate observations.
> >>
> >> Hope this clarifies,
> >>
> >> Roger
> >>
> >>>
> >>> I am totally new to the area of spatial econometrics, so I am taking things
> >>> one step at a time. Some readings suggest I may need a distance matrix or
> >>> weight matrix but for now I think I should try the current approach.
> >>>
> >>> Thank you.
> >>>
> >>> -------------
> >>> Lom
> >>>
> >>> On Wed, Jun 3, 2020 at 8:18 AM Roger Bivand <Roger.Bivand using nhh.no> wrote:
> >>>
> >>>> On Wed, 3 Jun 2020, Lom Navanyo wrote:
> >>>>
> >>>>> I had the errors with rtree using R 3.6.3. I have since changed to R 4.0.0
> >>>>> but I got the same error.
> >>>>>
> >>>>> And yes, for Roger's example, I have the objects wd1, ... wd4, all with
> >>>>> length 101. I think my difficulty is my inability to output the list
> >>>>> detailing the point IDs t50_fid.
> >>>> library(spData)
> >>>> library(sf)
> >>>> # project to a metric CRS so that distances are in metres
> >>>> projdata <- st_transform(nz_height, 32759)
> >>>> pts <- st_coordinates(projdata)
> >>>> library(spdep)
> >>>> # band limits in metres (0.25, 1, 2, 3 and 4 miles)
> >>>> bufferR <- c(402.336, 1609.34, 3218.69, 4828.03, 6437.38)
> >>>> bds <- c(0, bufferR)
> >>>> # neighbours of each point within successive distance bands
> >>>> wd1 <- dnearneigh(pts, bds[1], bds[2])
> >>>> wd2 <- dnearneigh(pts, bds[2], bds[3])
> >>>> wd3 <- dnearneigh(pts, bds[3], bds[4])
> >>>> wd4 <- dnearneigh(pts, bds[4], bds[5])
> >>>> # one row per neighbour pair ("from", "to"), labelled with its band
> >>>> sn_band1 <- listw2sn(nb2listw(wd1, style="B", zero.policy=TRUE))
> >>>> sn_band1$band <- paste(attr(wd1, "distances"), collapse="-")
> >>>> sn_band2 <- listw2sn(nb2listw(wd2, style="B", zero.policy=TRUE))
> >>>> sn_band2$band <- paste(attr(wd2, "distances"), collapse="-")
> >>>> sn_band3 <- listw2sn(nb2listw(wd3, style="B", zero.policy=TRUE))
> >>>> sn_band3$band <- paste(attr(wd3, "distances"), collapse="-")
> >>>> sn_band4 <- listw2sn(nb2listw(wd4, style="B", zero.policy=TRUE))
> >>>> sn_band4$band <- paste(attr(wd4, "distances"), collapse="-")
> >>>> # stack the bands into a single plain data.frame
> >>>> data_out <- do.call("rbind", list(sn_band1, sn_band2, sn_band3, sn_band4))
> >>>> class(data_out) <- "data.frame"
> >>>> table(data_out$band)
> >>>> # look up point IDs and attributes in the input data
> >>>> data_out$ID_from <- projdata$t50_fid[data_out$from]
> >>>> data_out$ID_to <- projdata$t50_fid[data_out$to]
> >>>> data_out$elev_from <- projdata$elevation[data_out$from]
> >>>> data_out$elev_to <- projdata$elevation[data_out$to]
> >>>> str(data_out)
> >>>>
> >>>> The "spatial.neighbour" representation was that used in the S-Plus
> >>>> SpatialStats module, with "from" and "to" columns, and here drops
> >>>> no-neighbour cases gracefully. So listw2sn() comes in useful
> >>>> for creating the output, and from there, just look-up in the
> >>>> input data.frame. Observations here cannot be their own neighbours.
> >>>>
> >>>> It would be relevant to know why you need these, are you looking at
> >>>> variogram clouds?
> >>>>
> >>>> Hope this clarifies,
> >>>>
> >>>> Roger
> >>>>
> >>>>>
> >>>>> ---------
> >>>>> Lom
> >>>>>
> >>>>> On Tue, Jun 2, 2020 at 8:02 PM Kent Johnson <kent3737 using gmail.com> wrote:
> >>>>>
> >>>>>> Roger's example works for me and gives a list of length 101. I did have
> >>>>>> some issues that were resolved by updating packages. I'm using R 3.6.3 on
> >>>>>> macOS 10.15.4. I also use rtree successfully on Windows 10 with R 3.6.3.
> >>>>>>
> >>>>>> Kent
> >>>>>>
> >>>>>> On Tue, Jun 2, 2020 at 12:29 PM Roger Bivand <Roger.Bivand using nhh.no> wrote:
> >>>>>>
> >>>>>>> On Tue, 2 Jun 2020, Kent Johnson wrote:
> >>>>>>>
> >>>>>>>> rtree uses Euclidean distance so the points should be in a coordinate
> >>>>>>>> system where this makes sense at least as a reasonable approximation.
> >>>>>>>
> >>>>>>> I tried the original example:
> >>>>>>>
> >>>>>>> remotes::install_github("hunzikp/rtree")
> >>>>>>> library(spData)
> >>>>>>> library(sf)
> >>>>>>> projdata<-st_transform(nz_height, 32759)
> >>>>>>> library(rtree)
> >>>>>>> pts <- st_coordinates(projdata)
> >>>>>>> rt <- RTree(st_coordinates(projdata))
> >>>>>>> bufferR <- c(402.336, 1609.34, 3218.69, 4828.03, 6437.38)
> >>>>>>> wd1 <- withinDistance(rt, pts, bufferR[1])
> >>>>>>>
> >>>>>>> but unfortunately failed (maybe newer Boost headers than yours?):
> >>>>>>>
> >>>>>>> Error in UseMethod("withinDistance", rTree) :
> >>>>>>>    no applicable method for 'withinDistance' applied to an object of class "c('list', 'RTree')"
> >>>>>>>
> >>>>>>>>
> >>>>>>>> Kent
> >>>>>>>>
> >>>>>>>> On Tue, Jun 2, 2020 at 9:59 AM Roger Bivand <Roger.Bivand using nhh.no> wrote:
> >>>>>>>>
> >>>>>>>>> On Tue, 2 Jun 2020, Kent Johnson wrote:
> >>>>>>>>>
> >>>>>>>>>>> Date: Tue, 2 Jun 2020 02:44:17 -0500
> >>>>>>>>>>> From: Lom Navanyo <lomnavasia using gmail.com>
> >>>>>>>>>>> To: r-sig-geo using r-project.org
> >>>>>>>>>>> Subject: [R-sig-Geo] How to efficiently generate data of neighboring
> >>>>>>>>>>>         points within specified radii (distances) for each point in a
> >>>>>>>>>>>         given points data set.
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> Hello,
> >>>>>>>>>>> I have a data set of about 3400 location points with which I am trying to
> >>>>>>>>>>> generate data of each point and their neighbors within defined radii (eg,
> >>>>>>>>>>> 0.25, 1, and 3 miles).
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> The rtree package is very fast and memory-efficient for within-distance
> >>>>>>>>>> calculations.
> >>>>>>>>>> https://github.com/hunzikp/rtree
> >>>>>>>>>
> >>>>>>>>> Thanks! Does this also apply when the input points are in geographical
> >>>>>>>>> coordinates?
> >>>>>>>>>
> >>>>>>>>> Roger
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Kent Johnson
> >>>>>>>>>> Cambridge, MA
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> _______________________________________________
> >>>>>>>>>> R-sig-Geo mailing list
> >>>>>>>>>> R-sig-Geo using r-project.org
> >>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> Roger Bivand
> >>>>>>>>> Department of Economics, Norwegian School of Economics,
> >>>>>>>>> Helleveien 30, N-5045 Bergen, Norway.
> >>>>>>>>> voice: +47 55 95 93 55; e-mail: Roger.Bivand using nhh.no
> >>>>>>>>> https://orcid.org/0000-0003-2392-6140
> >>>>>>>>> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
> >>>>>>>>>
