[R-sig-Geo] Question spdep package - lagsarlm not terminating

Roger Bivand Roger.Bivand at nhh.no
Sun May 5 15:21:12 CEST 2019


On Sat, 4 May 2019, Raphael Mesaric via R-sig-Geo wrote:

> Dear all,
>
> I have a question with regards to the function "lagsarlm" from the 
> package spdep. My problem is that the function is not terminating. Of 
> course, I have quite a big grid (depending on the selection either 34000 
> or 60600 entries) and I also have a lot of explanatory variables (about 
> 40). But I am still wondering whether there is something wrong.

If you read the help page (the function is in the spatialreg package and 
will be dropped from spdep shortly), and look at the references (Bivand et 
al. 2013), you will see that the default value of the method= argument is 
"eigen". For small numbers of observations, solving the eigenproblem of a 
dense weights object is not a problem, but it becomes demanding on memory 
as n increases. You are using virtual memory (and may run out of that too), 
which makes your machine unresponsive. Choose an alternative method 
instead, typically "Matrix" for symmetric or similar to symmetric sparse 
weights, or "LU" for asymmetric sparse weights.
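
To put rough numbers on the memory demand, here is a minimal base R sketch. It assumes the dense weights object is stored as an n x n matrix of 8-byte doubles, and it counts one copy only, so treat the result as a lower bound (an eigenvalue routine needs working space on top of this). The helper name dense_bytes and the labels in n_obs are made up for illustration; 49 is the number of observations in columbus, and the two grid sizes are the ones quoted in the question.

```r
# Bytes needed to hold one dense n x n matrix of doubles (8 bytes each).
# Hypothetical helper, not part of spdep or spatialreg.
dense_bytes <- function(n) as.numeric(n)^2 * 8

# Observation counts: columbus, the house data, and the two grid selections.
n_obs <- c(columbus = 49, house = 25357, grid_small = 34000, grid_large = 60600)

# Size of a single dense copy in decimal gigabytes.
dense_gb <- dense_bytes(n_obs) / 1e9
print(round(dense_gb, 2))
```

Under these assumptions a single dense copy needs roughly 9 GB for 34000 observations and roughly 29 GB for 60600, against a few kilobytes for columbus.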

> data(house, package="spData")
> dim(house)
[1] 25357    24
> LO_nb
Neighbour list object:
Number of regions: 25357
Number of nonzero links: 74874
Percentage nonzero weights: 0.01164489
Average number of links: 2.952794
> lw <- spdep::nb2listw(LO_nb)
> system.time(res <- spatialreg::lagsarlm(log(price) ~ TLA + frontage + 
+ rooms + yrbuilt, data=house, listw=lw, method="Matrix"))
    user  system elapsed
   0.606   0.011   0.631

so less than 1 second on a standard laptop for similar to symmetric very 
sparse weights and ~ 25000 observations. Less sparse weights take somewhat 
longer. The function does not (perhaps not yet) prevent users from trying 
to do things that are not advisable, because they may have 128GB RAM or 
more and want to use eigenvalues rather than sparse matrix methods.
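
As a rough illustration of why the sparse methods are so much cheaper here, the sketch below compares dense and sparse storage for the counts printed for LO_nb above. The sparse figure assumes a compressed sparse column layout (an 8-byte value and a 4-byte row index per nonzero, plus n + 1 4-byte column pointers). That is an assumption about the representation, not a statement of what lagsarlm allocates internally.

```r
# Figures reported for LO_nb in the printed neighbour list above.
n   <- 25357   # number of regions
nnz <- 74874   # number of nonzero links

# Share of nonzero cells in the n x n weights matrix, in percent.
pct_nonzero <- 100 * nnz / n^2

# Dense storage: one 8-byte double per cell, in decimal megabytes.
dense_mb <- n^2 * 8 / 1e6

# Compressed sparse column storage (assumed layout): per nonzero an 8-byte
# value and a 4-byte row index, plus n + 1 4-byte column pointers.
sparse_mb <- (nnz * (8 + 4) + (n + 1) * 4) / 1e6

print(c(pct_nonzero = pct_nonzero, dense_mb = dense_mb, sparse_mb = sparse_mb))
```

The percentage matches the "Percentage nonzero weights" line in the output above, and the storage comes to about 5 GB dense against about 1 MB sparse.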

Hope this clarifies,

Roger

>
> I tried to run a model based on the dataset "columbus", and there I did 
> not have any problems (but there are far fewer entries and variables). I 
> also compared the format of the required inputs, but everything seemed 
> to be equivalent to the inputs used for the "columbus" model.
>
> Do you have any idea what might be the reason for the extremely long 
> (respectively infinite, it has not terminated yet) computation time? Any 
> suggestions are greatly appreciated.
>
> If you would like to, I can also upload the corresponding code. However, 
> the code includes some MAT-Files as I got the data in MATLAB. I do not 
> yet attach them here because I read that attachments in another format 
> than PDF are not desired as they may contain malicious software.
>
> Thank you for your help in advance!
>
> Best regards,
>
> Raphael Mesaric
>

-- 
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: Roger.Bivand at nhh.no
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en

