[R-sig-Geo] Fwd: Question spdep package - lagsarlm not terminating
Roger Bivand
Roger.Bivand using nhh.no
Mon May 6 17:20:58 CEST 2019
On Mon, 6 May 2019, Raphael Mesaric via R-sig-Geo wrote:
>
>> Dear Roger,
>>
>> Thank you very much for your fast response.
>>
>> My weight matrix stems from a rectangular (but not square) grid. So, I
>> think I will have to use the „LU“ method.
No, this is a misunderstanding. All weights matrices used for fitting
models are square nxn matrices by definition. If your grid had r rows and
c columns, n = r * c, and the weights matrix has n rows and n columns.
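
A minimal sketch of this point, using a toy 3 x 4 grid rather than the poster's data and assuming the spdep package is installed. The object names (r, c_cols, nb, W) are illustrative only:

```r
library(spdep)

r <- 3       # grid rows (toy value chosen for illustration)
c_cols <- 4  # grid columns (toy value)

# Rook-contiguity neighbours on an r x c grid: one entry per cell
nb <- cell2nb(r, c_cols, type = "rook")
n <- length(nb)

# Dense row-standardised weights matrix; only sensible for small n
W <- nb2mat(nb, style = "W")

n       # number of cells, r * c
dim(W)  # n rows and n columns, even though the grid is not square
```

For a grid of realistic size the dense matrix should not be built; the listw object from spdep::nb2listw() carries the same n x n structure in sparse form.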
>>
>> However, by doing that several other error messages popped up. The
>> first one was the following:
>>
>> Error in spatialreg::errorsarlm(formula = formula, data = data, listw =
>> listw, :
>> NAs in lagged dependent variable
>> In addition: Warning messages:
>> 1: Function errorsarlm moved to the spatialreg package
>> 2: In lag.listw(listw, y, zero.policy = zero.policy) :
>> NAs in lagged values
>>
>> And this happened even though I do not actually have NaN values in my
>> dataset (I removed them in advance). When I completely reload all
>> variables, it sometimes works for a few tries, but after a while the
>> error message pops up again. Do you know why this might happen? And
>> could I theoretically use a dataset with NaN values and have the
>> function omit them?
>>
By default, observations with missing values are dropped and the weights
object is subsetted to match. This may produce no-neighbour observations,
which are the probable cause of your error. You can choose to set their
spatially lagged values to zero by using zero.policy=TRUE. It could very
well be that your weights object itself contains no-neighbour observations
if it was not created using spdep::nb2listw(), which will alert you to the
problem.
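
One way to check for no-neighbour observations can be sketched as follows. It assumes spdep is installed and uses a toy 3 x 3 grid in which the neighbours of the first cell are dropped on purpose, so that the first cell is left isolated. All object names are hypothetical:

```r
library(spdep)

nb <- cell2nb(3, 3, type = "rook")  # toy grid, 9 cells

# Drop the neighbours of cell 1 so that cell 1 ends up with no neighbours
keep <- !(seq_along(nb) %in% nb[[1]])
nb_sub <- subset(nb, keep)

# card() counts neighbours per observation; zeros flag the problem cases
no_nb <- which(card(nb_sub) == 0)

# Without zero.policy, nb2listw() refuses to build the weights object
strict <- tryCatch(nb2listw(nb_sub, zero.policy = FALSE),
                   error = function(e) conditionMessage(e))

# With zero.policy=TRUE the lagged value of an isolated cell is set to zero
lw0 <- nb2listw(nb_sub, zero.policy = TRUE)
y <- as.numeric(seq_along(nb_sub))
lag_y <- lag.listw(lw0, y, zero.policy = TRUE)
```

Here no_nb identifies the isolated cell, strict holds the error message, and lag_y has a zero in the position of the isolated cell instead of NA.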
>>
>> The second error message was the following:
>>
>> Error in solve.default(-(mat), tol.solve = tol.solve) :
>> system is computationally singular: reciprocal condition number =
>> 2.71727e-26
>>
>> I found a reference online (related to glm) that this might be linked
>> to explanatory variables which have a high correlation. Would
>> eliminating some of the variables do the job here as well?
>>
Please do not use references online (especially if you do not link to
them). The information you need is in the help page, referring to the
tol.solve= argument. Indeed, your covariates are either collinear
(unlikely), or scaled such that the numerical values of the coefficient
variances are very different from that of the spatial coefficient. You
may need to re-scale the response or covariates in order to invert the
matrix needed to yield the variances of the coefficients.
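
The effect of scaling on invertibility can be illustrated with base R alone. This is only an analogy using an ordinary cross-product matrix, not the matrix that errorsarlm() inverts internally, and the variable names and magnitudes are made up:

```r
set.seed(1)

# Two unrelated covariates measured on wildly different scales
x1 <- rnorm(100) * 1e-4
x2 <- rnorm(100) * 1e6
X <- cbind(x1, x2)

# Reciprocal condition number of X'X before and after re-scaling
rc_raw <- rcond(crossprod(X))
rc_scaled <- rcond(crossprod(scale(X)))

# solve() rejects the badly scaled matrix at its default tolerance
raw_inverse <- tryCatch(solve(crossprod(X)),
                        error = function(e) conditionMessage(e))

# The re-scaled matrix inverts without complaint
scaled_inverse <- solve(crossprod(scale(X)))
```

The covariates are not collinear here; the near-singularity comes from scale alone, which is why re-scaling rather than dropping variables is the first thing to try.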
>>
>> Then I have two other questions, where I couldn’t find any answers
>> online:
>>
>> Is there an option to eliminate specific rows of a nb object? I tried
>> the following command:
>>
>> rook234x259_2 <- rook234x259[index],
spdep::subset.nb() only. Your suggestion only creates a big mess, as
subsetting reduces the number of rows, and neighbours are indexed between
1 and n. Consequently, at least some of the remaining vectors will point
to neighbours with ids > n, and many of the others will point to the wrong
neighbours. You can convert to a sparse matrix, subset using "[", and then
convert back, but the subset method should work.
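
A small sketch of the difference, assuming spdep is installed and using a toy 3 x 3 grid in place of rook234x259 (the names keep, naive and nb_sub are hypothetical):

```r
library(spdep)

nb <- cell2nb(3, 3, type = "rook")  # toy grid, 9 cells
keep <- seq_along(nb) > 2           # drop the first two cells, keep 7

# Plain "[" keeps the old neighbour ids, which still run from 1 to 9
naive <- nb[keep]
bad_ids <- max(unlist(naive)) > length(naive)

# The subset method drops links to removed cells and renumbers the rest
nb_sub <- subset(nb, keep)
ok_class <- inherits(nb_sub, "nb")
ok_ids <- max(unlist(nb_sub)) <= length(nb_sub)
```

bad_ids is TRUE because the naive result still refers to cells 8 and 9 in a list of length 7, while nb_sub remains a valid nb object with ids between 1 and 7.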
>>
>> where index is a vector with the row numbers I would like to keep. But
>> this converts the nb to a list object, and there is no list2nb function
>> (as far as I am aware). And the option „subset“ does not only
>> remove the rows, but also the corresponding cells, which leads to a
>> different neighborhood matrix.
I do not know what you mean; of course it has to remove links in both
directions.
>>
>>
>> The last question refers to the package spgwr. I would like to run this
>> model as well. The gwr function runs successfully, but when calling the
>> variable where I stored the result, I get the following error message
>> instead of my coefficients:
>>
>> Error in abs(coef.se <- xm[, cs.ind, drop = FALSE]) :
>> non-numeric argument to mathematical function
>> In addition: Warning messages:
>> 1: In print.gwr(x) : NAs in coefficients dropped
>> 2: In cbind(CM, coefficients(x$lm)) :
>> number of rows of result is not a multiple of vector length (arg 2)
>>
I always advise against using GWR for anything other than studies
exploring its weaknesses - do not use in research or production. Just
showing the error message without a small reproducible example using a
built-in data set gives nothing to go on. Does the error occur when using
print() on a fitted object? Maybe the NAs in the coefficients suggest
collinearity in your covariates, possibly after weighting.
>> Again, there aren’t any NaN values in my dataset, so I can’t really
>> imagine where the NAs in the coefficients come from.
But it is terribly easy to introduce them numerically, so they are coming
from what you are doing.
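
As a base R illustration of how NA coefficients can arise from complete data, here is a sketch with an ordinary lm() fit (not gwr()) on made-up data in which one covariate is an exact multiple of another:

```r
set.seed(42)

d <- data.frame(x1 = rnorm(20))
d$x2 <- 2 * d$x1                 # exactly collinear with x1
d$y <- 1 + d$x1 + rnorm(20)

no_missing <- !anyNA(d)          # the data contain no NA or NaN values

fit <- lm(y ~ x1 + x2, data = d)
coefs <- coef(fit)               # the coefficient for x2 is reported as NA
```

The aliased covariate gets an NA coefficient although no_missing is TRUE, so NAs in the output do not imply NAs in the input.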
Roger
>>
>>
>> Thank you very much for your help!
>>
>> Best regards,
>>
>> Raphael
>>
>>
>>> On 05.05.2019 at 15:21, Roger Bivand <Roger.Bivand using nhh.no> wrote:
>>>
>>> On Sat, 4 May 2019, Raphael Mesaric via R-sig-Geo wrote:
>>>
>>>> Dear all,
>>>>
>>>> I have a question with regard to the function "lagsarlm" from the package spdep. My problem is that the function is not terminating. Of course, I have quite a big grid (depending on the selection either 34000 or 60600 entries) and I also have a lot of explanatory variables (about 40). But I am still wondering whether there is something wrong.
>>>
>>> If you read the help page (the function is in the spatialreg package and will be dropped from spdep shortly), and look at the references (Bivand et al. 2013), you will see that the default value of the method= argument is "eigen". For small numbers of observations, solving the eigenproblem of a dense weights object is not a problem, but it becomes demanding on memory as n increases. You are using virtual memory (and may run out of that too), which makes your machine unresponsive. Choose an alternative method instead, typically "Matrix" for symmetric or similar to symmetric sparse weights, or "LU" for asymmetric sparse weights.
>>>
>>> > data(house, package="spData")
>>> > dim(house)
>>> [1] 25357    24
>>> > LO_nb
>>> Neighbour list object:
>>> Number of regions: 25357
>>> Number of nonzero links: 74874
>>> Percentage nonzero weights: 0.01164489
>>> Average number of links: 2.952794
>>> > lw <- spdep::nb2listw(LO_nb)
>>> > system.time(res <- spatialreg::lagsarlm(log(price) ~ TLA + frontage +
>>> +   rooms + yrbuilt, data=house, listw=lw, method="Matrix"))
>>>    user  system elapsed
>>>   0.606   0.011   0.631
>>>
>>> so less than 1 second on a standard laptop for similar to symmetric very sparse weights and ~ 25000 observations. Less sparse weights take somewhat longer. The function does not (maybe yet) prevent users trying to do things that are not advisable, because maybe they have 128GB RAM or more, and want to use eigenvalues rather than sparse matrix methods.
>>>
>>> Hope this clarifies,
>>>
>>> Roger
>>>
>>>>
>>>> I tried to run a model based on the dataset „columbus“, and there I did not have any problems (but there are far fewer entries and variables). I also compared the format of the required inputs, but everything seemed to be equivalent to the inputs used for the „columbus“ model.
>>>>
>>>> Do you have any idea what might be the reason for the extremely long (respectively infinite, it has not terminated yet) computation time? Any suggestions are greatly appreciated.
>>>>
>>>> If you would like to, I can also upload the corresponding code. However, the code includes some MAT-Files as I got the data in MATLAB. I do not yet attach them here because I read that attachments in another format than PDF are not desired as they may contain malicious software.
>>>>
>>>> Thank you for your help in advance!
>>>>
>>>> Best regards,
>>>>
>>>> Raphael Mesaric
>>>> _______________________________________________
>>>> R-sig-Geo mailing list
>>>> R-sig-Geo using r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>
>>>
>>> --
>>> Roger Bivand
>>> Department of Economics, Norwegian School of Economics,
>>> Helleveien 30, N-5045 Bergen, Norway.
>>> voice: +47 55 95 93 55; e-mail: Roger.Bivand using nhh.no
>>> https://orcid.org/0000-0003-2392-6140
>>> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
>>
>
--
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: Roger.Bivand using nhh.no
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en