[R-sig-Geo] SAR Poisson GLM model

Roger Bivand Roger.Bivand at nhh.no
Mon Feb 1 11:15:28 CET 2016

On Mon, 1 Feb 2016, Clément Gorin wrote:

> Hi,

> I am estimating a gravity model of migration on cross-sectional data. 
> The Moran's I statistic indicates positive and significant spatial 
> autocorrelation in the residuals of the aspatial model, and the 
> Lagrange Multiplier test points to the Spatial Autoregressive (SAR) 
> model as the preferred specification. While I have no issue fitting a 
> linear SAR (LeSage and Pace 2008) to my data, it does not accommodate 
> the very large number of zeroes (> 90%) in my dependent variable. This 
> clearly points to a Poisson process (Santos Silva and Tenreyro 2006).
> In short, I am having trouble running the SAR Poisson GLM. I had two 
> questions:
> (1) Is there a method to run a SAR Poisson GLM in R? (I searched a lot 
> before posting here)

Look at your model first. A gravity model estimated as Poisson implies that 
the 278k observations are interactions, that is origin/destination pairs, 
not counts of origins or destinations. sqrt(278784) is 528, which I think 
is your real n. Your data are zero-inflated, so maybe you need to use a 
zero-inflated approach. Your use of SAR is ambiguous: you mean 
y ~ rho W y + X beta, but SAR really means simultaneous autoregressive, as 
distinct from conditional autoregressive (CAR). Just saying you want to 
run a SAR Poisson suggests autocorrelation in the interactions, but the 
spatial autocorrelation is likely in the origin and destination fixed 
effects, not in the interactions.
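
The arithmetic behind "your real n" can be checked in a few lines. This is 
only a sketch of the reasoning, written in Python for illustration; the 
variable names are made up, and it assumes the data are a full origin by 
destination table with the diagonal included.

```python
import math

n_pairs = 278784                 # number of observations reported in the question
n_units = math.isqrt(n_pairs)    # integer square root of the pair count

# True only if the pair count is a perfect square, which is what a full
# origin x destination table (diagonal included) would produce.
is_full_od_table = n_units * n_units == n_pairs

print(n_units, is_full_od_table)
```

If the count were not a perfect square, the pairs would not form a complete 
n x n table and the number of spatial units would have to be read from the 
data instead.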

Did you look at the spatial regression section of the Spatial task view?

There is a SAR Poisson approach (in your terms) in INLA - the slm latent 
model does something like this, but will not handle the zero-inflation, 
and most likely isn't appropriate to your setting. In any case, a Poisson 
approach without an offset (log expected interactions) may not be 
sensible, in addition to the spatial autocorrelation actually "belonging 
to" the origins and destinations, not the interactions.
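
What an offset does in a Poisson model with a log link can be sketched as 
follows. The function and argument names are hypothetical, and what 
"expected interactions" should be for a given migration data set is left 
open here; the only point is that the offset enters the linear predictor 
with its coefficient fixed at 1.

```python
import math

def poisson_mean(expected, eta):
    """Mean of a Poisson model with log link and offset log(expected).

    log(mu) = log(expected) + eta, so mu = expected * exp(eta).
    `eta` stands for the usual linear predictor X beta (hypothetical name).
    """
    return math.exp(math.log(expected) + eta)

# With eta = 0 the fitted mean equals the expected count itself.
baseline = poisson_mean(10.0, 0.0)
```

Without the offset, the covariates have to explain the raw size of each 
flow as well as its departure from what would be expected.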

> (2) If the answer to (1) is no, I should at least use a spatially filtered 
> Poisson GLM. Yet both SpatialFiltering() and ME() crash even with a 
> very simple connectivity structure (symmetric knn = 5). By this I mean 
> that there was no error message; RStudio simply "lost the connection 
> with the R session". I suspect this is due to the large number of 
> observations (278 784). Do you have tips to increase computational 
> efficiency?

Never run sensitive processes under RStudio. They have not yet (tens of 
months' waiting) replied to a query of this kind, and the problem in some 
cases may be with them. In a former case under Windows, a similar message 
was seen under RStudio, but the underlying command (not yours) ran to 
completion in RGui. Always report the output of sessionInfo() - your 
platform is unknown.

Why would you think that a dense 278k x 278k matrix could be handled? 
SF/ME work by selecting eigenvectors from the weights (knn is pretty 
unsatisfactory too - it does not yield a planar graph). Your initial 
memory needs are roughly 600GB, so this is probably not the way to go.
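
The memory figure follows from storing every cell of a dense matrix as an 
8-byte double. A rough sketch of that calculation, in Python for 
illustration and with made-up variable names, next to the number of 
nonzero weights a knn = 5 structure has before it is made symmetric:

```python
n_obs = 278784            # origin/destination pairs treated as observations
bytes_per_double = 8      # size of one double-precision value

dense_bytes = n_obs * n_obs * bytes_per_double
dense_gb = dense_bytes / 1e9       # decimal gigabytes
dense_gib = dense_bytes / 2**30    # binary gibibytes

# A k-nearest-neighbour graph has only k nonzero weights per row before
# symmetrisation, so the weights themselves are tiny by comparison.
k = 5
knn_nonzeros = n_obs * k

print(round(dense_gb), round(dense_gib), knn_nonzeros)
```

The sparse weights are not the problem; the dense n x n matrix that the 
eigenvectors are taken from is.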

If the earlier comments are correct, you actually have n=528, meaning that 
you'd need first to align the SF processes with the origin and destination 
fixed effects (or add WX to the X) - see work by Griffith and Chun - 
Yongwan Chun may even have code for DOI: 10.1080/00045608.2011.561070 and 
other publications. IIRC you roll out the eigenvectors to the nxn 
interactions like other fixed effects.
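
One reading of "rolling out" is to repeat each region-level eigenvector 
value across all pairs in which that region is the origin, and likewise 
for the destination. The sketch below is in Python for illustration and 
rests on assumptions: the function name is hypothetical, the pairs are 
ordered with the origin varying slowest, and the eigenvector values are 
made up. It is not taken from Griffith and Chun's code.

```python
def roll_out(e):
    """Expand a length-n regional vector to the n*n origin/destination pairs.

    Pairs are assumed ordered (0,0), (0,1), ..., (0,n-1), (1,0), ...
    Returns (origin_covariate, destination_covariate), each of length n*n.
    """
    n = len(e)
    # Value of the origin region, held constant across its n destinations.
    origin = [e[i] for i in range(n) for _ in range(n)]
    # Value of the destination region, cycling within each origin.
    dest = [e[j] for _ in range(n) for j in range(n)]
    return origin, dest

e = [0.5, -0.1, -0.4]     # made-up eigenvector values for 3 regions
origin_cov, dest_cov = roll_out(e)
```

The two expanded vectors can then enter the interaction model as ordinary 
covariates, which keeps the eigen-decomposition at n = 528 rather than at 
278784.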

Hope this clarifies,


> Best,
> Clément Gorin
> PhD student, GATE LSE

Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 91 00
e-mail: Roger.Bivand at nhh.no
