[R-sig-Geo] Simulating spatially autocorrelated data

Roger Bivand Roger.Bivand at nhh.no
Wed Sep 7 21:30:39 CEST 2011


On Wed, 7 Sep 2011, Terry Griffin wrote:

> Patrick,
>
> Specification of the spatial weights matrix (W) is important, and, in 
> general, the connectedness of the W influences the estimation and 
> inference of the model. When you say that you do not know the "true 
> rho", I suspect you are saying that you do not know the true underlying 
> spatial structure of the data, and thus the appropriate specification of 
> the spatial weights matrix. One tool in the spdep package that may be 
> helpful to you is the sp.correlogram function for spatial correlogram; 
> other techniques have been used including semivariograms. I would be 
> interested in what others have to say regarding determining the optimal 
> level of connectedness of W.
>
> Two classic references regarding connectedness of W are:
>
> Florax, R.J.G.M. and Rey, S. 1995 The Impacts of Misspecified Spatial 
> Interaction in Linear Regression Models. In Anselin L, Florax R J G M 
> (eds) New directions in spatial econometrics. Berlin, Springer: 111-135
>
> Bell, K.P. and Bockstael, N.E. 2000. Applying the Generalized-Moment 
> Estimation Approach to Spatial Problems Involving Micro level Data. The 
> Review of Economics and Statistics, February 2000, 82 (1): 72-82.

And a newer one:

Smith, T. E. (2009), Estimation Bias in Spatial Models with Strongly 
Connected Weight Matrices. Geographical Analysis, 41: 307–332. doi: 
10.1111/j.1538-4632.2009.00758.x

The spatial weights themselves are sparsely connected, but the implied 
spatial process is dense, because the inverse of (I - \rho W) is dense. If 
the weights are strongly connected, the assumed process probably implies
"oversmoothing".
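
A minimal base-R sketch of the sparse-weights/dense-inverse point. The 
chain of 10 sites and rho = 0.5 are hypothetical choices made only for 
illustration; they are not taken from the simulations in this thread. The 
inverse can be read as I + rho W + rho^2 W^2 + ..., so higher-order 
neighbours enter even though W links only adjacent sites:

```r
# Hypothetical setup: n sites on a line, each linked only to the site
# immediately before and after it.
n <- 10
rho <- 0.5

B <- matrix(0, n, n)
B[abs(row(B) - col(B)) == 1] <- 1   # binary first-order neighbours (sparse)
W <- B / rowSums(B)                 # row-standardised weights, as style="W"

A <- solve(diag(n) - rho * W)       # (I - rho W)^{-1}

sum(W != 0)          # number of non-zero weights in W
sum(abs(A) > 1e-12)  # number of non-zero entries in the inverse
round(A[1, ], 4)     # first row of the inverse: every site contributes
```

Comparing the two counts shows how few entries of W are non-zero while 
the inverse has no zero entries at all.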

Roger

>
> Terry Griffin, Ph.D.
> Associate Professor - Economics
> University of Arkansas - Division of Agriculture
> 501.249.6360 (SMS)
> tgriffin at uaex.edu
>
>
>
>
> ----- Original Message -----
> From: "Patrick Downey" <PDowney at urban.org>
> To: "Roger Bivand" <Roger.Bivand at nhh.no>
> Cc: r-sig-geo at stat.math.ethz.ch
> Sent: Tuesday, September 6, 2011 1:02:04 PM
> Subject: Re: [R-sig-Geo] Simulating spatially autocorrelated data
>
> Hi Roger and Terry,
>
> Thank you very much for your help and directing me towards Roger's spdep
> package, which of course had everything I needed. I've now worked through
> this code and done some additional simulations.
>
> I have one remaining question. You say "the larger the distance threshold,
> the less well the spatial process is captured." I was wondering if you
> could further provide some information on this, either by explaining or
> referencing a document or webpage with explanation.
>
> Decreasing the distance threshold, as you suggest, radically alters the
> results and I'm looking for some guidance on how to select the appropriate
> distance threshold when I don't know the true rho (that is, with
> non-simulated data).
>
> Thanks,
> Mitch
>
>
> -----Original Message-----
> From: Roger Bivand [mailto:Roger.Bivand at nhh.no]
> Sent: Thursday, September 01, 2011 2:20 PM
> To: Downey, Patrick
> Cc: r-sig-geo at stat.math.ethz.ch
> Subject: Re: [R-sig-Geo] Simulating spatially autocorrelated data
>
> On Thu, 1 Sep 2011, Downey, Patrick wrote:
>
>> Hello all,
>>
>> I'm trying to simulate a spatially autocorrelated random variable, and
>> I cannot figure out what the problem is. All I want is a simple
>> spatial lag model where
>>
>> Y = rho*W*Y + e
>>
>> Where e is a vector of iid normal random variables, rho is the
>> autocorrelation, W is a row-normalized distance matrix (a spatial
>> weights matrix), and Y is the random variable.
>>
>> I thought the following program should do it, but it's not working. At
>> the end of the program, I calculate Moran's I, and it is not even
>> close to rejecting the null hypothesis of no spatial autocorrelation,
>> even when rho is very high (for example, below, rho is 0.95). Can
>> someone please identify what the problem is and offer some guidance on
>> how to fix it?
>>
>> PS - I apologize in advance, but I am not familiar with R's spatial
>> packages. I've done very little spatial analysis in R, so if there's a
>> package that can already do this, please recommend.
>>
>> BEGIN PROGRAM:
>>
>> install.packages("fields");library(fields)
>> install.packages("ape");library(ape)
>>
>> N <- 200
>> rho <- 0.95
>>
>> x.coord <- runif(N,0,100)
>> y.coord <- runif(N,0,100)
>>
>> points <- cbind(x.coord,y.coord)
>>
>> e <- rnorm(N,0,1)
>>
>> dist.nonnorm <- rdist(points,points)   # Matrix of Euclidean distances
>> dist <- dist.nonnorm/rowSums(dist.nonnorm)   # Row normalizing the distance matrix
>> diag(dist) <- 0   # Ensuring that the main diagonal is exactly 0
>
> I think that you are using the distances themselves as weights, rather
> than inverse distances, which would seem more sensible.
>
>>
>> I <- diag(N)   # Identity matrix (not Moran's I)
>>
>> inv <- solve(I-rho*dist)   # Inverting (I - rho*W)
>> y <- as.vector(inv %*% e)   # Generating data that is supposed to be spatially autocorrelated
>>
>> Moran.I(y,dist)   # Does not reject null hypothesis of no spatial autocorrelation
>>
>
> As Terry Griffin says, you can use spdep for this:
>
> library(spdep)
> rho <- 0.95
> N <- 200
> x.coord <- runif(N,0,100)
> y.coord <- runif(N,0,100)
> points <- cbind(x.coord,y.coord)
> e <- rnorm(N,0,1)
> dnb <- dnearneigh(points, 0, 150)
> dsts <- nbdists(dnb, points)
> idw <- lapply(dsts, function(x) 1/x)
> lw <- nb2listw(dnb, glist=idw, style="W")
> inv <- invIrW(lw, rho)
> y <- inv %*% e
> moran.test(y, lw)
>
> This reproduces your analysis with inverse distance weights (IDW). Here
> it is without IDW, using the raw distances as weights:
>
> lw <- nb2listw(dnb, glist=dsts, style="W")
> inv <- invIrW(lw, rho)
> y <- inv %*% e
> moran.test(y, lw) # no autocorrelation
>
> and here with a less inclusive distance threshold:
>
> dnb <- dnearneigh(points, 0, 15)
> dsts <- nbdists(dnb, points)
> idw <- lapply(dsts, function(x) 1/x)
> lw <- nb2listw(dnb, glist=idw, style="W")
> inv <- invIrW(lw, rho)
> y <- inv %*% e
> moran.test(y, lw)
>
>
> The larger the distance threshold, the less well the spatial process is
> captured. Alternatively, use idw <- lapply(dsts, function(x) 1/(x^2)), for
> example, to attenuate the weights more sharply.
>
> Hope this clarifies,
>
> Roger
>
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>
>
> --
> Roger Bivand
> Department of Economics, NHH Norwegian School of Economics, Helleveien 30,
> N-5045 Bergen, Norway.
> voice: +47 55 95 93 55; fax +47 55 95 95 43
> e-mail: Roger.Bivand at nhh.no
>
>

-- 
Roger Bivand
Department of Economics, NHH Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no

