[R-sig-Geo] SAR Poisson GLM model
Roger.Bivand at nhh.no
Mon Feb 1 11:15:28 CET 2016
On Mon, 1 Feb 2016, Clément Gorin wrote:
> I am estimating a gravity model of migration on cross-sectional data.
> The Moran I statistic indicates a positive and significant spatial
> autocorrelation in the residuals of the a-spatial model, and the
> Lagrange Multiplier test points to the Spatial Autoregressive (SAR)
> model as the preferred specification. While I have no issue fitting a
> linear SAR (Le Sage and Pace 2008) to my data, it does not accommodate
> the very large number of zeroes (> 90%) in my dependent variable. This
> clearly points to a Poisson process (Santos Silva and Tenreyro 2006).
> In short, I am having trouble running the SAR Poisson GLM. I had two
> questions:
> (1) Is there a method to run a SAR Poisson GLM in R? (I searched a lot
> before posting here)
Look at your model first. A gravity model estimated as Poisson implies the
278k are interactions, that is origin/destination pairs, not counts of
origins or destinations. sqrt(278784) is 528, which I think is your real
n. Your data are zero-inflated, so maybe you need to use a zero-inflated
approach. Just saying you want to run a SAR Poisson suggests
autocorrelation in the interactions, but the spatial autocorrelation is
likely in the origin and destination fixed effects, not in the
interactions. (Your use of SAR is also ambiguous: you mean
y ~ rho W y + X beta, but SAR really means simultaneous autoregressive,
as distinct from conditional autoregressive (CAR).)
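As a quick check of the arithmetic above, a minimal sketch in base R (the
object names are illustrative only, and it assumes every origin is paired
with every destination, including itself):

```r
# Number of origin/destination pairs reported in the question
n_pairs <- 278784

# If all n origins are crossed with all n destinations, n_pairs = n^2
n <- sqrt(n_pairs)
n

# Confirm that n is a whole number and squares back to n_pairs
stopifnot(n == round(n), n^2 == n_pairs)
```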
Did you look at the spatial regression section of the Spatial task view?
There is a SAR Poisson approach (in your terms) in INLA - the slm latent
model does something like this, but will not handle the zero-inflation,
and most likely isn't appropriate to your setting. In any case, a Poisson
approach without an offset (log expected interactions) may not be
sensible, in addition to the spatial autocorrelation actually "belonging
to" the origins and destinations, not the interactions.
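To make the offset point concrete, here is a minimal sketch of an
a-spatial Poisson GLM with log expected interactions as the offset. The
data are simulated and all variable names (flow, dist, expected) are
made up for illustration; it does not address the zero-inflation or the
spatial autocorrelation.

```r
set.seed(1)

n <- 20                                   # toy number of regions (hypothetical)
od <- expand.grid(dest = seq_len(n), orig = seq_len(n))  # n^2 O/D pairs

od$dist <- abs(od$orig - od$dest) + 1     # made-up separation measure (>= 1)
od$expected <- runif(nrow(od), 5, 50)     # made-up expected interactions

# Simulate flows with a distance-decay coefficient of -0.3
od$flow <- rpois(nrow(od), od$expected * exp(-0.3 * log(od$dist)))

# Poisson GLM: the offset enters the linear predictor with coefficient 1
fit <- glm(flow ~ log(dist) + offset(log(expected)),
           family = poisson, data = od)
coef(fit)
```

With an offset the coefficients describe departures from the expected
interactions rather than raw counts.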
> (2) If answer to (1) is no, I should at least use a spatially filtered
> Poisson GLM. Yet, both SpatialFiltering() and ME() crash even using a
> very simple connectivity structure (symmetric knn = 5). I mean that it
> did not give any error message but RStudio simply "lost the connection
> with the R session". I suspect this is due to the large number of
> observations (278 784). Do you have tips to increase computational
Never run sensitive processes under RStudio. They have not yet (tens of
months' waiting) replied to a query of this kind, and the problem in some
cases may be with them. In a former case under Windows, a similar message
was seen under RStudio, but the underlying command (not yours) ran to
completion in RGui. Always report the output of sessionInfo() - your
platform is unknown.
Why would you think that a dense 278k x 278k matrix could be handled?
SF/ME work by selecting eigenvectors from the weights (knn is pretty
unsatisfactory too - it does not yield a planar graph). Your initial
memory needs are roughly 600GB, so this is probably not the way to go.
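The figure comes from simple arithmetic, sketched here in base R under the
assumption that one dense matrix of doubles (8 bytes per cell) is needed
before any eigenvectors can be extracted:

```r
n_pairs <- 278784

# One double-precision value per cell of a dense n_pairs x n_pairs matrix
bytes <- n_pairs^2 * 8

bytes / 1e9      # size in gigabytes (decimal)
bytes / 2^30     # size in gibibytes (binary)
```

Any intermediate copies made during the eigen-decomposition would add to
this.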
If the earlier comments are correct, you actually have n=528, meaning that
you'd first need to align the SF processes with the origin and destination
fixed effects (or add WX to the X) - see work by Griffith and Chun -
Yongwan Chun may even have code for DOI: 10.1080/00045608.2011.561070 and
other publications. IIRC you roll out the eigenvectors to the nxn
interactions like other fixed effects.
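A minimal sketch of the rolling-out step in base R, on a toy example. The
5-region line graph, the choice of the two leading eigenvectors and the
ordering of the pairs are all assumptions for illustration; this is not
Griffith and Chun's code, and in SF/ME the eigenvectors are selected
against the data rather than taken in order.

```r
n <- 5                                    # toy number of regions (hypothetical)

# Toy binary contiguity: 5 regions on a line, neighbours share an edge
W <- matrix(0, n, n)
W[cbind(1:(n - 1), 2:n)] <- 1
W <- W + t(W)

# Doubly centre the weights: M W M with M = I - 11'/n
M <- diag(n) - matrix(1 / n, n, n)
eig <- eigen(M %*% W %*% M, symmetric = TRUE)

# Two leading eigenvectors, chosen here only for illustration
E <- eig$vectors[, 1:2, drop = FALSE]

# Assumed ordering of the n^2 pairs: origin varies slowest,
# destination fastest, so pair (i, j) sits in row (i - 1) * n + j
ones <- matrix(1, n, 1)
E_orig <- kronecker(E, ones)   # row (i - 1) * n + j carries E[i, ]
E_dest <- kronecker(ones, E)   # row (i - 1) * n + j carries E[j, ]

dim(E_orig)
dim(E_dest)
```

The columns of E_orig and E_dest can then enter the interaction model as
extra covariates alongside the other origin and destination terms, with
only n x n matrices ever decomposed.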
Hope this clarifies,
> Clément Gorin
> PhD student, GATE LSE
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 91 00
e-mail: Roger.Bivand at nhh.no