[R-sig-Geo] spdep/splm: k-nearest neighbors, normalizations, listw2U, spatial models, methods

Tue Aug 21 23:30:59 CEST 2012

On Tue, 21 Aug 2012, Roger Bivand wrote:

> On Mon, 20 Aug 2012, Fernando Bruna Quintas wrote:
>
>> Dear Prof. Bivand and listers:
>> 
>> I started with spatial econometrics and R a few months ago and this is the 
>> first time I have written here. My apologies if I have missed something.
>> 
>> I am a bit confused about the final weight matrix I am using with spdep. 
>> I start with an inverse distance matrix to the 5 nearest neighbors, so 
>> asymmetric general weights. I chose this matrix as my baseline for 
>> several reasons: my research is about distance, Griffith's (1996) rules of 
>> thumb, inspection of the plots of links...
>> 
>> I build the listw with style W (row-normalization) to apply errorsarlm or 
>> lagsarlm. But at least for method eigen, these functions use the function 
>> listw2U, which changes my listw.

I can confirm that row-standardised k=5 asymmetric weights give the same 
regression coefficients in lagsarlm() and Stata's spreg ml. Nothing 
untoward is going on, and no changes are being made to the weights.
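
For readers who want to reproduce the kind of weights discussed here, the following is a minimal sketch. It assumes random point coordinates rather than the poster's data, and the object names (coords, idw, lw5) are illustrative only. It builds row-standardised inverse-distance weights over the 5 nearest neighbours and checks that every row of the resulting matrix sums to 1.

```r
library(spdep)

set.seed(1)
# Hypothetical coordinates; the original poster's data are not available
coords <- cbind(runif(100), runif(100))

# k = 5 nearest neighbours: each point gets exactly 5 neighbours,
# but the relation need not be reciprocal
nb5 <- knn2nb(knearneigh(coords, k = 5))

# Inverse-distance general weights for each neighbour link
dists <- nbdists(nb5, coords)
idw <- lapply(dists, function(d) 1 / d)

# Row-standardise (style "W"); style "B" would keep the raw inverse distances
lw5 <- nb2listw(nb5, glist = idw, style = "W")

m5 <- listw2mat(lw5)
row_sums <- rowSums(m5)
range(row_sums)                          # rows sum to 1 after row-standardisation
is.symmetric.nb(nb5, verbose = FALSE)    # typically FALSE for k-nearest neighbours
```

The listw object built this way is what would be passed to the model fitting functions; nothing in this sketch symmetrises it.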

>
> Good that you realise that you are confused! Your conclusion is wrong. If 
> (and only if) the listw object can be Ord-transformed to symmetry (an 
> underlying symmetric set of neighbours and weighting scheme "W" or "S"), the 
> Ord transformation is applied, and any numerical fuzz (differences < 1e-16 
> across the diagonal) is removed with listw2U(). If the underlying 
> neighbours are asymmetric, the can.sim variable in the model fitting 
> functions is FALSE, so the eigenvalues are extracted from the asymmetric 
> matrix. You missed the if() statement in the code in eigen_setup() (in 
> jacobian.R), documented in ?eigen_setup
>
>> 
>> I will try to write separate questions or sentences about my doubts, though 
>> they are related:
>> 
>> Q1 - I understand that the similar.listw function 
>> (http://rgm2.lab.nig.ac.jp/RGM2/func.php?rd_id=spdep:similar.listw) does 
>> the same as listw2U, but I am not sure.
>
> No, similar.listw() does the Ord transformation, listw2U does (W + W')*0.5
>
>> 
>> Q2 - I see that listw2U uses the function make.sym.nb, adding the neighbors 
>> that were asymmetric. If this is true, I miss the property of asymmetry, so 
>> I do not see the point of starting with k-nearest neighbors.
>
> The assumption is that people know what they are doing, so if the user wants 
> to impose symmetry, this is a possible choice. The function is not used as 
> much as you seem to think.

Running listw2U() and 0.5*(W + t(W)) are fully equivalent for asymmetric 
neighbours:

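library(spdep)
set.seed(1) # or your choice of seed
res <- logical(500)
for (i in seq_along(res)) {
   # k=5 nearest neighbours of 100 random points: generally asymmetric
   nb5 <- knn2nb(knearneigh(cbind(runif(100), runif(100)), k=5))
   lw5 <- nb2listw(nb5, style="W")
   m5 <- listw2mat(lw5)
   MU5 <- 0.5*(m5 + t(m5))   # symmetrise by hand
   lwU5 <- listw2U(lw5)      # symmetrise with listw2U()
   MUU5 <- listw2mat(lwU5)
   # all.equal() returns a character vector on mismatch, so wrap it in isTRUE()
   res[i] <- isTRUE(all.equal(MU5, MUU5))
}
table(res) # all TRUE if the two constructions agree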

make.sym.nb() is used to find the "missing" cross-diagonal entries.
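
What make.sym.nb() does to a k-nearest-neighbour list can be seen in a short sketch, again assuming random points and illustrative object names. Wherever i lists j as a neighbour but j does not list i, the reverse link is added, so the symmetrised list has at least as many links as the original.

```r
library(spdep)

set.seed(2)
coords <- cbind(runif(60), runif(60))   # hypothetical points
nb5 <- knn2nb(knearneigh(coords, k = 5))

# Add the link j -> i wherever i -> j exists but the reverse link is missing
nb5s <- make.sym.nb(nb5)

n_links_before <- sum(card(nb5))    # exactly 5 links per point
n_links_after  <- sum(card(nb5s))   # at least as many after symmetrising
sym_after <- is.symmetric.nb(nb5s, verbose = FALSE)

c(before = n_links_before, after = n_links_after)
sym_after
```

The symmetrised neighbour set is the set of positions where 0.5*(W + t(W)) can have non-zero entries.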

>
>> 
>> Q3 - Similarly, I understand that these functions are applying Ord (1975) 
>> transformation, though the notation here confuses me: 
>> http://rgm2.lab.nig.ac.jp/RGM2/func.php?rd_id=spdep:lm.morantest The helper 
>> function listw2U() constructs a weights list object corresponding to the 
>> sparse matrix 1/2 (W + W').
>
> Some derivations of Moran tests are based on symmetry, and most often 
> cross-products of W are involved, making the point moot (see Cliff & Ord 
> 1973).
>

The output of lm.morantest(), and its equivalents in PySAL and OpenGeoDa 
are identical for k=5 asymmetric weights.
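
One way to see why symmetrising makes no difference to the statistic itself: the numerator of Moran's I is the quadratic form z'Wz, and z'Wz = z'(0.5*(W + W'))z for any square W. The base R sketch below checks this numerically with made-up random weights and data rather than spdep objects. moran_I() is a hypothetical helper written for this illustration, not the spdep implementation.

```r
set.seed(123)
n <- 20

# Hypothetical asymmetric weights: random non-negative entries, zero diagonal
W <- matrix(runif(n * n), n, n)
diag(W) <- 0
W <- W / rowSums(W)          # row-standardise; W stays asymmetric

U <- 0.5 * (W + t(W))        # symmetrised counterpart

x <- rnorm(n)                # made-up variable
z <- x - mean(x)             # centred values

# Moran's I = (n / S0) * (z' M z) / (z' z), with S0 the sum of all weights
moran_I <- function(M, z) {
  (length(z) / sum(M)) * drop(t(z) %*% M %*% z) / sum(z^2)
}

I_W <- moran_I(W, z)
I_U <- moran_I(U, z)
all.equal(I_W, I_U)          # the two statistics agree up to rounding
```

The sum of the weights is also unchanged by the averaging, so the scaling factor n / S0 is the same in both cases.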

>> 
>> Q4 - The help in moran.test explains that for inherently non-symmetric 
>> matrices, such as k-nearest neighbour matrices, listw2U() can be used to 
>> make the matrix symmetric. Does this mean that I should pass my W matrix 
>> through listw2U before the Moran test? In row-normalized W style or in 
>> B style with inverse distance weights?
>
> I'm travelling and cannot answer more now. I think that you have confused 
> yourself more than necessary. There is no problem for model fitting, and as 
> far as I am aware, no problem for tests. Please provide worked examples 
> showing that the current code leads to results that you can show are 
> incorrect (for example giving different results from OpenGeoDa, PySAL, Matlab 
> Spatial Econometrics toolbox). Simplify your question to one point, not many 
> as now.
>
> Hope this clarifies,
>
> Roger
>
>> 
>> Q5 - In Elhorst (2010) (Fischer & Getis book, page 380) I see that the Ord 
>> transformation leaves unchanged the mutual proportions between the elements 
>> of W, which is relevant for inverse distances (Anselin 1998, 23-24). I 
>> understand that if I want to keep the interpretation of inverse distances I 
>> should apply this transformation to the listw in style B, without 
>> row-normalization, with my general spatial weights of inverse distances. 
>> But the description of method eigen in, for instance, lagsarlm, and ASDAR 
>> page 284, say that Ord normalization can be used only with W matrices. I 
>> have tried this method to estimate spatial models with my listw in both B 
>> and W styles, but I am not sure what I am actually doing in each case.

Ord normalization can be used on W and S style, for underlying symmetric 
weights. If the underlying weights are not symmetric, it cannot be used, 
and the eigenvalues will be complex. For comparison use the LU method, 
which can also handle intrinsically asymmetric weights.
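
The distinction can be illustrated in base R on a toy example (the matrices below are invented for illustration and are not produced by spdep). For a symmetric binary neighbour matrix B with row sums d, the row-standardised W = D^{-1} B is asymmetric but similar to the symmetric D^{-1/2} B D^{-1/2}, so both share the same real eigenvalues; this is the Ord (1975) transformation. Averaging W with its transpose is a different operation and changes the eigenvalues. With intrinsically asymmetric neighbours, such as the artificial directed cycle at the end, no such symmetric counterpart exists and the eigenvalues can be complex.

```r
# Symmetric binary neighbours on a path 1-2-3-4 (toy example)
B <- matrix(c(0, 1, 0, 0,
              1, 0, 1, 0,
              0, 1, 0, 1,
              0, 0, 1, 0), 4, 4, byrow = TRUE)
d <- rowSums(B)

W <- B / d                                  # row-standardised: asymmetric
Wsim <- diag(1 / sqrt(d)) %*% B %*% diag(1 / sqrt(d))  # symmetric, similar to W

ev_W   <- sort(Re(eigen(W)$values))
ev_sim <- sort(eigen(Wsim, symmetric = TRUE)$values)
all.equal(ev_W, ev_sim)                     # same spectrum

# 0.5 * (W + W') is symmetric too, but it is not similar to W
Wu <- 0.5 * (W + t(W))
ev_u <- sort(eigen(Wu, symmetric = TRUE)$values)
rbind(ev_W, ev_u)                           # the two spectra differ

# Artificial directed 3-cycle: asymmetric underlying neighbours
# (not a real k-nearest-neighbour graph, just the smallest such case)
C <- matrix(c(0, 1, 0,
              0, 0, 1,
              1, 0, 0), 3, 3, byrow = TRUE)
ev_cyc <- eigen(C)$values
ev_cyc                                      # cube roots of unity, two of them complex
```

This is only a numerical illustration of the algebra; it does not reproduce the checks that the model fitting functions carry out internally.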

Roger

>> 
>> Q7 - As with standardization, making the matrix symmetric loses the 
>> geographical interpretation. Unit A can be among the nearest neighbors of 
>> B without the converse being true, so their main links would not be 
>> reciprocal. I am not sure in which ways the issue of symmetry in the nb 
>> list differs from the normalization of weights. The two are usually 
>> discussed together, but they are conceptually different.
>> 
>> Q8 - If it is a problem of estimation, I can use other methods to keep my 
>> matrix asymmetric and/or with non-standardized weights. Which one would be 
>> better?
>> 
>> Q9 - Any idea about how the splm package deals with some of the previous 
>> issues?
>> 
>> Q10 - This question is more general. I am still not sure I understand 
>> the reasons for normalization. Maybe it is to make estimation feasible by 
>> keeping the eigenvalues in the right range. Right now I do not care about 
>> the easier interpretation of the parameters after row-normalization. My 
>> question is whether normalization helps to correct spatial autocorrelation. 
>> If so, why does it help if, at least with row-normalization, it loses the 
>> main information about the absolute distance to each neighbor? Why should 
>> the impact on each unit of all other units be equalized? Is each unit 
>> equally open, equally close to the other ones? Or is this just to avoid 
>> imposing structure on the spatial dependence, so that general spatial 
>> weights are not that useful (just relative distances)?
>> 
>> Any thoughts about these doubts would be very much appreciated. Thank you 
>> very much
>> 
>> Fernando

-- 
Roger Bivand
Department of Economics, NHH Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no