[R-sig-Geo] Inference of local Gi*

Sat Apr 25 00:11:31 CEST 2020

Dear Anaïs,

I am sure more experienced members will give you a better answer, but until
then I will try to help.

1) If I understood correctly, the spatial objects have 15 000 and 30 000
points in the two case studies, respectively. If so, I am afraid that nb
objects built from datasets this large will affect system performance
(memory and run time) in subsequent tasks. The best I can suggest is to
try some form of spatial binning if possible (e.g. hexbins), while keeping
in mind the modifiable areal unit problem, since the results may change
with the size and shape of the bins.
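
As a rough illustration of the binning idea, here is a minimal sketch that
aggregates points into square grid cells and keeps the count and mean value
per cell. It uses square cells rather than hexbins only to keep the code
short, it is written in plain Python so that it is self-contained, and the
function name bin_points, the cell_size parameter and the sample data are
all made up for the example.

```python
from collections import defaultdict
from math import floor


def bin_points(points, cell_size):
    """Aggregate (x, y, value) points into square grid cells.

    Returns a dict mapping (col, row) to (count, mean_value).
    """
    sums = defaultdict(lambda: [0, 0.0])  # cell -> [count, running total]
    for x, y, value in points:
        key = (floor(x / cell_size), floor(y / cell_size))
        sums[key][0] += 1
        sums[key][1] += value
    return {key: (n, total / n) for key, (n, total) in sums.items()}


# Made-up coordinates and values, for illustration only.
points = [(0.2, 0.3, 22.0), (0.8, 0.1, 24.0), (1.5, 0.4, 30.0), (1.7, 1.9, 26.0)]
cells = bin_points(points, cell_size=1.0)
```

The local statistic would then be computed on the cells instead of the
individual points. Repeating the binning with a couple of different cell
sizes is one simple way to see how sensitive the result is to the choice
of bins.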

2) The spdep::localG help page states that "For inference, a
Bonferroni-type test is suggested in the references, where tables of
critical values may be found". The source mentioned is freely accessible
and can be found here:

Ord, J. K. and Getis, A. 1995 Local spatial autocorrelation statistics:
distributional issues and an application. Geographical Analysis, 27, 286–306
https://onlinelibrary.wiley.com/doi/pdf/10.1111/j.1538-4632.1995.tb00912.x

Critical values ("standard measures") for selected percentiles and numbers
of entities are given in Table 3 of the cited reference. Since the values
returned by localG are Z-values, you can compare each one with the chosen
critical value; where it is exceeded, you can infer significant local
spatial association for that entity.
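
To make the comparison concrete, here is a minimal sketch, again in plain
Python so that it is self-contained. It does not reproduce Table 3; instead
it assumes a two-sided test and derives a Bonferroni-type critical value
from the standard normal quantile function, which is only an approximation
to the tabulated values. The function names, alpha = 0.05 and the Z-values
are made up for the example.

```python
from statistics import NormalDist


def bonferroni_critical_z(n, alpha=0.05):
    """Two-sided normal critical value, Bonferroni-corrected for n tests."""
    return NormalDist().inv_cdf(1 - alpha / (2 * n))


def flag_significant(z_values, alpha=0.05):
    """Return the critical value and, per entity, whether |z| exceeds it."""
    crit = bonferroni_critical_z(len(z_values), alpha)
    return crit, [abs(z) > crit for z in z_values]


# Made-up Z-values standing in for the output of localG.
z_values = [0.4, -5.1, 2.0, 4.8, -1.2]
crit, flags = flag_significant(z_values)
```

Note that the critical value grows with the number of entities, so with
many thousands of points only very large Z-values will be flagged.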

Kind regards,
José

On Fri, 24 Apr 2020 at 14:00, Anaïs Ladoy (<anais.ladoy using epfl.ch>)
wrote:

> Dear list members,
>
> I'm currently working on a point dataset, from which I want to conduct
> a Hot Spot Analysis with local Gi* statistics (Getis-Ord).
>
> I'm trying to find a way of computing its significance. I see two ways
> of computing significance in this case:
>
> 1) Compare the obtained local Gi from spdep::localG to a normal
> distribution. But here I have several questions:
> a) In my first case study (BMI value of 15 000 participants in a cohort
> study), the distribution of local Gi is far from normal (it is bimodal
> with a mode around very negative values and a mode around 0). However,
> I do need a normal distribution of Gi in order to compare it with a
> normal distribution, right? Or am I missing something here? What should
> I do in this case?
> b) In my second case study (Years of life lost for 30 000 individuals),
> the distribution of Gi returned by spdep::localG is approximately
> normal but the standard deviation is far from 1. In fact, in
> spdep::localG, the Gi values are supposedly standardized (from what I
> understood using an analytical mean and variance). Should I use these
> to compare to a normal distribution, or should I use raw G values
> (using return_internals=TRUE) and standardize them with the observed
> mean and variance of Gi? Does it cause a problem that my observed
> variance does not match the analytical variance?
>
> 2) Compute permutations
> However this is not implemented in R for localG. I tried using PySAL
> but the initial file is big and the weight file is huge, and my
> computer crashes. Any thoughts to solve this issue?
>
> Thank you for any feedback.
> Kind regards,
> Anaïs
>
> --
> Anaïs Ladoy
> PhD student, Laboratory of Geographic Information Systems, Swiss
> Federal Institute of Technology in Lausanne (EPFL), Switzerland.
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo using r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>

-- 
*José Ramón Martínez Batlle*
*Researcher/Professor, Universidad Autónoma de Santo Domingo (UASD)*
Email: jmartinez19 using uasd.edu.do
Website: http://geografiafisica.org
