[R-sig-Geo] bivariate spatial correlation in R
Rafael Pereira
rafa.pereira.br at gmail.com
Sun Jul 30 00:38:49 CEST 2017
Hi all,
Here is a reproducible example to calculate bivariate Moran's I and LISA
clusters in R. This example is based on this answer provided on SO* and
it uses a toy model of my data. The R script and the shapefile with the
data are available at this link:
https://gist.github.com/rafapereirabr/5348193abf779625f5e8c5090776a228
What this example does is estimate the spatial association between
household income per capita and the gains in accessibility to jobs. The aim
is to analyze who benefits from the recent changes in the transport system in
terms of access to jobs. So the idea is not to find causal relationships,
but spatial association between areas of high/low income that had high/low
gains in accessibility.
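To make the idea concrete, here is a minimal base R sketch of one common
formulation of global and local bivariate Moran's I. It is not the script in
the gist: the function name bivariate_moran, the toy values and the
line-shaped neighbour structure are all hypothetical, and it assumes a
row-standardised weights matrix W with a zero diagonal. It gives point values
only, with no significance test.

```r
# Sketch only: one common formulation of bivariate Moran's I.
# Assumption: W is a row-standardised weights matrix with a zero diagonal.
bivariate_moran <- function(x, y, W) {
  zx <- as.vector(scale(x))       # standardise x (mean 0, sd 1)
  zy <- as.vector(scale(y))       # standardise y (mean 0, sd 1)
  lag_y <- as.vector(W %*% zy)    # spatial lag: weighted mean of neighbours' y
  n <- length(zx)
  local_i <- zx * lag_y           # local bivariate statistic, one per area
  global_i <- sum(local_i) / (n - 1)
  # Quadrant labels compare an area's own x with its neighbours' y
  quadrant <- ifelse(zx >= 0 & lag_y >= 0, "High-High",
              ifelse(zx <  0 & lag_y <  0, "Low-Low",
              ifelse(zx >= 0 & lag_y <  0, "High-Low", "Low-High")))
  list(global = global_i, local = local_i, quadrant = quadrant)
}

# Hypothetical toy data: 5 areas on a line, adjacent areas are neighbours
W <- matrix(0, 5, 5)
for (i in 1:4) {
  W[i, i + 1] <- 1
  W[i + 1, i] <- 1
}
W <- W / rowSums(W)               # row-standardise

income     <- c(10, 12, 20, 35, 40)  # hypothetical income per capita
diffaccess <- c(1, 2, 4, 8, 9)       # hypothetical accessibility gains (p.p.)

res <- bivariate_moran(income, diffaccess, W)
res$global
res$quadrant
```

In a real analysis W would come from the polygon contiguity structure rather
than a hand-built matrix, and the quadrant labels would only be mapped for
areas that pass a significance test.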
The variables in the data give the proportion of jobs accessible in
2014 and in 2017 (access2014, access2017) and the difference
between the two years in percentage points (diffaccess).
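As a small illustration of how these variables relate, the sketch below
builds diffaccess from the two yearly shares. The values are made up, and it
assumes the shares are stored in percent (0 to 100) rather than as fractions;
the relchange column is hypothetical and is shown only to contrast percentage
points with relative change.

```r
# Hypothetical toy values: share of jobs accessible, in percent
access <- data.frame(
  access2014 = c(12.0, 30.5, 45.0),
  access2017 = c(15.0, 29.5, 54.0)
)

# Difference in percentage points: absolute change in the share
access$diffaccess <- access$access2017 - access$access2014

# Relative change in percent, for contrast only (not a variable in the data)
access$relchange <- 100 * access$diffaccess / access$access2014

access
```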
Roger, I know you have been a bit sceptical about this application
of bivariate Moran's I. Do you still think a spatial regression would be
more appropriate?
Also, I would be glad to hear if others have comments on the code. This
function is not implemented in any package so it would be great to have
some feedback.
Rafael H M Pereira
urbandemographics.blogspot.com
* https://stackoverflow.com/questions/45177590/map-of-bivariate-spatial-correlation-in-r-bivariate-lisa
On Wed, Jul 26, 2017 at 11:07 AM, Roger Bivand <Roger.Bivand at nhh.no> wrote:
> On Wed, 26 Jul 2017, Rafael Pereira wrote:
>
>> Roger,
>>
>> This example was provided only for the sake of making the code easily
>> reproducible for others, and I'm more interested in how the bi-variate
>> Moran could be implemented in R, but your comments are very much welcome
>> and I've made changes to the question.
>>
>> My actual case study looks at bi-variate spatial correlation between (a)
>> average household income per capita and (b) proportion of jobs in the city
>> that are accessible under 60 minutes by transit. I don't think I could use
>> rates in this case but I will normalize the variables using
>> scale(data$variable).
>>
>
> Please provide a reproducible example, either with a link to a data
> subset, or using a builtin data set. My guess is that you do not need
> bi-variate spatial correlation at all, but rather a spatial regression.
>
> The "causal" variable would then be the proportion of jobs accessible
> within 60 minutes by transit, though this is extremely blunt, and lots of
> other covariates (demography, etc.) impact average household income per
> capita (per block/tract?). Since there are many missing variables in your
> specification, any spatial correlation would be most closely associated
> with them (demography, housing costs, education, etc.), and the choice of
> units of measurement would dominate the outcome.
>
> This is also why bi-variate spatial correlation is seldom a good idea, I
> believe. It can be done, but most likely shouldn't, unless it can be
> motivated properly.
>
> By the way, the weighted and FDR-corrected SAD local Moran's I p-values of
> the black/white ratio for Oregon (your toy example) did deliver the goods -
> if you zoom in in mapview::mapview, you can see that it detects a rate
> hotspot between the rivers.
>
> Roger
>
>
>
>> best,
>>
>> Rafael H M Pereira
>>
>> On Mon, Jul 24, 2017 at 7:56 PM, Roger Bivand <Roger.Bivand at nhh.no>
>> wrote:
>>
>>> On Mon, 24 Jul 2017, Rafael Pereira wrote:
>>>
>>>> Hi all,
>>>>
>>>> I would like to ask whether some of you have conducted bi-variate
>>>> spatial correlation in R.
>>>>
>>>> I know the bi-variate Moran's I is not implemented in the spdep library.
>>>> I left a question on SO but also wanted to hear if anyone on the
>>>> main list has come across this.
>>>> https://stackoverflow.com/questions/45177590/map-of-bivariate-spatial-correlation-in-r-bivariate-lisa
>>>>
>>>> I also know Roger Bivand has implemented the L index proposed by Lee
>>>> (2001) in spdep, but I'm not sure whether the L local correlation
>>>> coefficients can be interpreted the same way as the local Moran's I
>>>> coefficients. I couldn't find any reference commenting on this issue. I
>>>> would very much appreciate your thoughts on this.
>>>>
>>>>
>>> In the SO question, and in the follow-up, your presumably throw-away
>>> example makes fundamental mistakes. The code in spdep by Virgilio
>>> Gómez-Rubio is for uni- and bivariate L, and produces point values of
>>> local L. This isn't the main problem, which is rather that you are not
>>> taking account of the underlying population counts, nor shrinking any
>>> estimates of significance to accommodate population sizes. Population
>>> sizes vary from 0 to 11858, with the lower quartile at 3164 and upper
>>> 5698: plot(ecdf(oregon.tract$pop2000)). Should you be comparing rates
>>> instead? These are also compositional variables (sum to pop2000, or 1
>>> if rates) with the other missing components. You would probably be
>>> better served by tools examining spatial segregation, such as for
>>> example the seg package.
>>>
>>> The 0 count populations cause problems for an unofficial alternative, the
>>> black/white ratio:
>>>
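>>> library(spdep)  # provides poly2nb() and nb2listw()
>>> # drop tracts with no white population to avoid division by zero
>>> oregon.tract1 <- oregon.tract[oregon.tract$white > 0,]
>>> oregon.tract1$rat <- oregon.tract1$black/oregon.tract1$white
>>> # contiguity neighbours, then row-standardised weights (default style "W")
>>> nb <- poly2nb(oregon.tract1)
>>> lw <- nb2listw(nb)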
>>>
>>> which should still be adjusted by weighting:
>>>
>>> lm0 <- lm(rat ~ 1, weights=pop2000, data=oregon.tract1)
>>>
>>> I'm not advising this, but running localmoran.sad on this model output
>>> yields SAD p-values < 0.05 after FDR correction only in contiguous tracts
>>> on the Washington state line in Portland between the Columbia and
>>> Willamette rivers. So do look at the variables you are using before
>>> rushing into things.
>>>
>>> Hope this clarifies,
>>>
>>> Roger
>>>
>>>
>>>> best,
>>>>
>>>> Rafael HM Pereira
>>>> http://urbandemographics.blogspot.com
>>>>
>>>> [[alternative HTML version deleted]]
>>>>
>>>> _______________________________________________
>>>> R-sig-Geo mailing list
>>>> R-sig-Geo at r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>
>>>>
>>> --
>>> Roger Bivand
>>> Department of Economics, Norwegian School of Economics,
>>> Helleveien 30, N-5045 Bergen, Norway.
>>> voice: +47 55 95 93 55; e-mail: Roger.Bivand at nhh.no
>>> Editor-in-Chief of The R Journal, https://journal.r-project.org/index.html
>>> http://orcid.org/0000-0003-2392-6140
>>> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
>>>