[R-sig-Geo] Inference of local Gi*

Anaïs Ladoy anais.ladoy using epfl.ch
Mon Apr 27 23:29:49 CEST 2020


Dear José and Roger, 
Thank you very much for your answers! Your detailed explanations are
really helpful, and I will follow your recommendations as I continue my
research work.
Kind regards,
Anaïs

On Mon, 2020-04-27 at 11:04 +0200, Roger Bivand wrote:
> On Sat, 25 Apr 2020, Jose Ramon Martinez Batlle wrote:
> Dear Anaïs,
> I am sure more experienced members will give you a better answer, but
> until then I will try to help.
> 1) If I understood correctly, the spatial objects have 15 000 and 30 000
> points in the two case studies, respectively. If so, I am afraid that
> nb objects for such large datasets will affect system performance when
> used in subsequent tasks. The best I can suggest is to try some sort
> of spatial binning if possible (e.g. hexbins), while at the same time
> accounting for the modifiable areal unit problem.
> 2) The spdep::localG help page states that "For inference, a
> Bonferroni-type test is suggested in the references, where tables of
> critical values may be found". The source mentioned is free access,
> and can be found here:
> Ord, J. K. and Getis, A. 1995 Local spatial autocorrelation
> statistics: distributional issues and an application. Geographical
> Analysis, 27, 286–306
> https://onlinelibrary.wiley.com/doi/pdf/10.1111/j.1538-4632.1995.tb00912.x
> Standard measures (critical values) for selected percentiles and
> numbers of entities are included in Table 3 of the cited reference.
> Since the values returned from localG are Z-values, you can compare
> them with the chosen critical value to determine whether it is
> exceeded, and thus infer significant local spatial association for
> each entity.
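The comparison of Z-values against a critical value described above can be sketched as follows. This is only an illustration under stated assumptions: it uses a plain two-sided Bonferroni split of alpha across all entities, so the thresholds need not match the published table, which may follow a different convention. The function names are hypothetical and are not part of spdep.

```python
from statistics import NormalDist


def bonferroni_critical_z(n_tests, alpha=0.05):
    """Two-sided critical z-value when alpha is split evenly over n_tests tests.

    Assumption: plain Bonferroni correction (alpha / n_tests per test).
    """
    per_test_alpha = alpha / n_tests
    return NormalDist().inv_cdf(1.0 - per_test_alpha / 2.0)


def flag_significant(z_values, alpha=0.05):
    """Return one boolean per entity: True where |z| exceeds the critical value."""
    crit = bonferroni_critical_z(len(z_values), alpha)
    return [abs(z) > crit for z in z_values]


if __name__ == "__main__":
    # With many entities the threshold is far above the familiar 1.96.
    print(round(bonferroni_critical_z(1), 3))
    print(round(bonferroni_critical_z(15000), 3))
```

With 15 000 or 30 000 entities the corrected threshold becomes quite demanding, which is one reason the choice of correction matters.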
> Thanks, José, you are quite correct that false discovery rate
> problems are among the main reasons why so-called "hot-spot" analyses
> may be very misleading, in appearing to give an inferential basis for
> apparent map pattern.
> In our survey paper with David Wong referenced on ?localG, 
> https://doi.org/10.1007/s11749-018-0599-x, we show that the
> analytical and bootstrap-based inferences are similar - the normality
> is related not to the underlying variable seen globally, but to the
> local behaviour of the statistic. For this reason, bootstrap
> permutation implementations are not included in spdep, though the
> code is available if need be. Please indicate whether users would
> like this code included for comparative purposes here or in a github
> issue on https://github.com/r-spatial/spdep/issues/.
> Further, the LOSH statistic, which is a measure of local spatial
> heteroscedasticity building on local G, provides a little insight
> into the problems raised for so-called "hot-spot" analyses by
> variability across the study area in the behaviour of the variable of
> interest. If, for example, the variable of interest is influenced by
> a background variable with a spatial pattern, we will probably find
> "hot-spots" which look like the omitted background variable on a map.
> While local G cannot take residuals of a linear model, local Moran's
> I can do so. For local G, we do not have exact case-by-case standard
> deviates; we do have these for local Moran's I as discussed in the
> article with David Wong, and they very typically strongly reduce the
> counts of apparently significant local statistics even before
> adjusting p-values for FDR. Finally, only some local measures can
> adjust for global autocorrelation - unadjusted local measures also
> respond to the presence of global autocorrelation.
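As an illustration of adjusting p-values for FDR as mentioned above, here is a minimal sketch of the Benjamini-Hochberg step-up adjustment applied to two-sided p-values derived from standard deviates. It is written from the textbook definition and is not code from spdep; in R, the base function p.adjust offers this adjustment. The function names below are hypothetical.

```python
from statistics import NormalDist


def two_sided_p(z):
    """Two-sided normal p-value for a standard deviate z."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(p_values)
    # Walk from the largest p-value down, keeping a running minimum so the
    # adjusted values are monotone in the original ranking.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for steps_from_top, idx in enumerate(order):
        rank = m - steps_from_top  # 1-based rank in ascending order
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    z_scores = [3.5, 2.1, 0.3, -2.8, 1.9]  # made-up deviates for illustration
    p_raw = [two_sided_p(z) for z in z_scores]
    p_fdr = bh_adjust(p_raw)
    print(sum(p < 0.05 for p in p_raw), sum(p < 0.05 for p in p_fdr))
```

Comparing the two counts printed at the end shows how many apparently significant entities survive the adjustment.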
> On balance, judicious choice of class intervals in mapping a variable
> of interest may prove more helpful than trying to present wobbly
> inferences from ESDA.
> Hope this isn't too discouraging,
> Roger
> 
> 
> Kind regards,
> José
> On Fri, 24 Apr 2020 at 14:00, Anaïs Ladoy (<anais.ladoy using epfl.ch>)
> wrote:
> Dear list members,
> I'm currently working on a point dataset, on which I want to conduct
> a Hot Spot Analysis with local Gi* statistics (Getis-Ord).
> I'm trying to find a way of computing its significance. I see two
> ways of computing significance in this case:
> 1) Compare the local Gi obtained from spdep::localG to a normal
> distribution. But here I have several questions:
> a) In my first case study (BMI value of 15 000 participants in a
> cohort study), the distribution of local Gi is far from normal (it is
> bimodal, with a mode around very negative values and a mode around 0).
> However, I do need a normal distribution of Gi in order to compare it
> with a normal distribution, right? Or am I missing something here?
> What should I do in this case?
> b) In my second case study (Years of life lost for 30 000
> individuals), the distribution of Gi returned by spdep::localG is
> approximately normal but the standard deviation is far from 1. In
> fact, in spdep::localG, the Gi values are supposedly standardized
> (from what I understood, using an analytical mean and variance).
> Should I use these to compare to a normal distribution, or should I
> use raw G values (using return_internals=TRUE) and standardize them
> with the observed mean and variance of Gi? Does it cause a problem
> that my observed variance does not match the analytical variance?
> 2) Compute permutations. However, this is not implemented in R for
> localG. I tried using PySAL, but the initial file is big and the
> weight file is huge, and my computer crashes. Any thoughts on how to
> solve this issue?
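To make the two routes above concrete, here is a minimal sketch of an analytically standardized Gi* and a conditional-permutation pseudo p-value, using neighbour lists instead of a dense weights matrix so that memory grows with the number of neighbour links rather than with n squared. It assumes binary weights and the standard Getis-Ord Gi* formula; it is not the spdep or PySAL implementation, and the function names are hypothetical.

```python
import math
import random


def local_gi_star(x, neighbours):
    """Gi* z-scores with binary weights.

    neighbours[i] lists the indices neighbouring i; i itself is added here,
    since Gi* includes the focal point. Assumes x is not constant and that
    no neighbourhood covers every observation.
    """
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum(v * v for v in x) / n - mean * mean)
    z = []
    for i in range(n):
        members = set(neighbours[i]) | {i}
        w_sum = len(members)  # binary weights: sum of w equals sum of w squared
        local_sum = sum(x[j] for j in members)
        denom = sd * math.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
        z.append((local_sum - mean * w_sum) / denom)
    return z


def permutation_p(x, neighbours, i, nsim=999, seed=0):
    """Two-sided conditional-permutation pseudo p-value for site i.

    x[i] is held fixed; the neighbour slots are refilled by sampling
    without replacement from the remaining observations.
    """
    rng = random.Random(seed)
    nb = set(neighbours[i]) - {i}
    others = [x[j] for j in range(len(x)) if j != i]
    observed = x[i] + sum(x[j] for j in nb)
    sims = [x[i] + sum(rng.sample(others, len(nb))) for _ in range(nsim)]
    centre = sum(sims) / nsim
    extreme = sum(abs(s - centre) >= abs(observed - centre) for s in sims)
    return (extreme + 1) / (nsim + 1)


if __name__ == "__main__":
    values = [1, 1, 1, 1, 10, 10, 10, 10]  # toy data on a line
    nbrs = [[j for j in (i - 1, i + 1) if 0 <= j < 8] for i in range(8)]
    print([round(v, 2) for v in local_gi_star(values, nbrs)])
    print(permutation_p(values, nbrs, 5))
```

Note that each z-score is standardized with its own analytical mean and variance, so the pooled set of z-scores over all sites need not look standard normal.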
> Thank you for any feedback.
> Kind regards,
> Anaïs
> --
> Anaïs Ladoy
> PhD student, Laboratory of Geographic Information Systems, Swiss
> Federal Institute of Technology in Lausanne (EPFL), Switzerland.
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo using r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo