[R-sig-Geo] credible interval for empirical Bayesian estimates of rates
Roger Bivand
Roger@B|v@nd @end|ng |rom nhh@no
Fri Apr 24 21:04:53 CEST 2020
On Fri, 24 Apr 2020, Dexter Locke wrote:
> Thanks Roger and list.
>
> I didn't think a repex was needed because a question was: why does
> spdep::EBest(counts, population, family = 'binomial') give the same
> results at GeoDa's, while EBest(.. binomial) is "binomial" while GeoDa
> calls that "Poisson-Gamma". GeoDa can't give use a repex (GUI) and think
> this is a question about terminology. The same results were achieved with
> the packages while naming the model differently - why?
Reproducible example:
auckland <- st_read(system.file("shapes/auckland.shp",
package="spData")[1], quiet=TRUE)
res <- EBest(auckland$M77_85, 9*auckland$Und5_81)
res0 <- EBest(auckland$M77_85, 9*auckland$Und5_81, family="binomial")
> summary(res$estmm)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.001487 0.002237 0.002549 0.002648 0.002968 0.004531
> summary(res0$estmm)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.001484 0.002235 0.002549 0.002648 0.002968 0.004536
After calculating Und5_81 * 9 as a new column, running GeoDa -> Map ->
Rates-Caluculated Map -> Empirical Bayes and exporting a shapefile, in R I
see:
> summary(auck$R_EBS)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.001487 0.002237 0.002549 0.002648 0.002968 0.004531
which is the same as the Poisson, and:
> all.equal(res$estmm, auck$R_EBS)
[1] TRUE
GeoDa is providing the same EB Poisson as EBest, isn't it? If yours
differ, are both implementations seeing the same data?
>
> Yes ?spdep::EBest directed me to the literature I'm struggling to access.
> And Yes, I've been looking at the raw code and understand how the estmm is
> generated.
>
> I've been using the epitools::pois.exact() and spdep::EBest. I can compare
> the point estimates from pois.exact to those provided by EBest, but I'd
> like to graph side by side their credible / confidence intervals.
>
> Its this last part on the credible intervals I'm interested in. How to get
> credible intervals around estmm? This is my main question.
EBest() was written to implement Bailey & Gatrell's textbook, which did
not provide CI, and just used the Marshall Auckland dataset. If you'd like
to implement them, I'd welcome a contribution.
>
> ASDAR is a reference I'm using all the time. Thanks for that gem.
>
> DCluster::empbaysmooth also does not provide a credible interval, either.
>
As you see from ch. 10, CI are described for the epitools implementation.
My feeling is that the literature moved away from simple EB rates towards
IID RE and spatially structured RE, with relevant covariates, say like the
classic Scottish Lip Cancer dataset, and currently CARBayes is a solid
package among many others. Simply using base population becomes too
unsatisfactory. PHE uses funnel plots which do have CI of a kind, to draw
attention from small base populations:
https://nhsrcommunity.com/blog/introduction-to-funnel-plots/ which can be
used to adjust class intervals for mapping.
Roger
> -Dexter
> http://dexterlocke.com/
>
>
>
> On Fri, Apr 24, 2020 at 10:23 AM Roger Bivand <Roger.Bivand using nhh.no> wrote:
>
>> On Fri, 24 Apr 2020, Dexter Locke wrote:
>>
>>> Dear esteemed list,
>>>
>>> I'm using spdep::EBest with family = 'binomial' for counts of events
>> within
>>> polygons that have an 'at risk' population. The resultant "estmm" is
>>> 'shrunk' compared to the raw rate (both given by EBest and calculated "by
>>> hand" rate. All good there.
>>>
>>> Using GeoDa version 1.14.0 24 August 2019 produces identical results for
>>> its Empirical Bayesian rate. This was confirmed by plotting the EBest
>>> output against GeoDa's rate and finding a perfect correlation along the 1
>>> to 1 line. All good there.
>>
>> Please provide a reproducible example, as this may help with answers.
>>
>>>
>>> Two questions:
>>> 1. How can credible intervals around these smoothed rate estimates be
>>> calculated?
>>> 2. The spdep documentation calls this a binomial family, but the
>> identical
>>> results are obtained from GeoDa calls this "Poisson-Gamma" model here:
>>> https://geodacenter.github.io/workbook/3b_rates/lab3b.html#fnref11 , so
>>> what is actually being calculated? This question may help me answer the
>>> first question..
>>
>> No, the default family is "poisson", with "binomial" available for
>> non-rare conditions following Martuzzi, implemented by Olaf
>> Berke, ?spdep::EBest.
>>
>> The code in spdep is easily accessible, so can be read directly. Please
>> also compare with code for the EB Moran test, and with analogous code in
>> the DCluster package, empbaysmooth(). Cf. ASDAR 2nd ed., ch. 10, section
>> 10.2, pp. 322-328. The epitools::pois.exact() function is used for CIs.
>> For code and data see https://asdar-book.org/bundles2ed/dismap_bundle.zip.
>>
>>>
>>> Possibly the answers are addressed in the literature cited which I cannot
>>> access right now at home without institutional library access.
>>>
>>
>> Most institutions do have proxy or VPN access, but the code will be as
>> useful. In PySAL, the code would also guide you, but even though GeoDa is
>> open source, the C++ is fairly dense.
>>
>> Hope this helps,
>>
>> Roger
>>
>>> Thanks for your consideration,
>>> Dexter
>>> http://dexterlocke.com/
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> _______________________________________________
>>> R-sig-Geo mailing list
>>> R-sig-Geo using r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>
>>
>> --
>> Roger Bivand
>> Department of Economics, Norwegian School of Economics,
>> Helleveien 30, N-5045 Bergen, Norway.
>> voice: +47 55 95 93 55; e-mail: Roger.Bivand using nhh.no
>> https://orcid.org/0000-0003-2392-6140
>> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
>>
>
--
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: Roger.Bivand using nhh.no
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
More information about the R-sig-Geo
mailing list