# [R-sig-Geo] DCluster

Jakob Petersen jakob.petersen at qmul.ac.uk
Mon Mar 28 16:42:18 CEST 2005

```
Hi r-sig-geo,
I am looking at the spatial distribution of poor households in a region
comprising a gradient of urban-rural postcodes. The data are counts, and
they fit a negative binomial distribution better than a Poisson
distribution.

I am applying DCluster (ver. 0.1-3, windows) and would be grateful for
advice on the following points:

1. GAM

A) As I understand it, the default setting is based on a Poisson
distribution. This creates some not implausible clusters, but I wonder
whether I could set up opgam so that it uses a negative binomial
distribution (I have the size and mu parameters for the ‘disease’
variable) or a bootstrap procedure instead. Some of
the internal functions, such as opgam.iscluster.negbin, seem to support
this, but I am unsure how to incorporate them.

B) To reduce the multiple testing problem (Waller & Gotway 2004,
“Applied Spatial Statistics for Public Health Data”, Wiley, p.208) I
wonder whether to set radius to <50% of step size, e.g. 100m radius in a
300m grid, so that the smallest circles won't touch?

2. Besag-Newell

I am getting results with ‘poisson’ (almost everything becomes a cluster
- possibly because the sites are clumped and not randomly distributed)
and with ‘permutation’, but I wonder how ‘negbin’ is used. Not like
this:

> bnresults <- opgam(pcpoor, thegrid=pcpoor[,c("x","y")], alpha=.05,
+   iscluster=bn.iscluster, set.idxorder=TRUE, k=20, model="negbin",
+   R=100, mle=calculate.mle(pcpoor))
Error in rnbinom(n, size, prob) : invalid arguments

3. Kulldorff & Nagarwalla

Again I struggle with the parameters. Not like this:

> # K&N's method over the centroids
> mle <- calculate.mle(pcpoor, model="negbin")
Error in while (((abs(m - m0) > tol * (m + m0)) || (abs(v - v0) > tol *  :
  missing value where TRUE/FALSE needed
> knresults <- opgam(data=pcpoor, thegrid=pcpoor[,c("x","y")], alpha=.05,
+   iscluster=kn.iscluster, fractpop=.5, R=100, model="negbin", mle=mle)
Error in rnbinom(n, size, prob) : invalid arguments

4. Turnbull. Is Turnbull analysis possible in DCluster yet? There are
some references to it in the manual, but I haven’t been able to locate it.

5. General

A) I am considering increasing the study area (I am currently working
with 1262 postcode points) and wonder what the limits might be for a
desktop PC. I gather that the distance matrices (created by tripack or
spdep) could be a limiting factor? Would it be an idea to run this step
first and, once the table is created, run the cluster detection algorithm?

B) I wonder whether permutations are always superior to standard
statistical distributions, and if not, then why not?

Best wishes, Jakob

Jakob Petersen
GISc student (MSc)
Birkbeck, University of London

```
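
A few of the quantities raised in the message can be checked in base R without DCluster. The sketch below is illustrative only: the `size` and `mu` values are hypothetical (the poster's estimates are not given), and the memory figure assumes a full, dense matrix of double-precision distances, which may not be how tripack or spdep store them. It shows (1) how the `size`/`mu` parameterisation relates to the `size`/`prob` form that appears in the `rnbinom(n, size, prob)` error message, (2) the geometric condition from question 1B, and (3) a rough size for an all-pairs distance matrix from question 5A.

```r
# (1) Negative binomial: link between the (size, mu) and (size, prob) forms.
# Hypothetical values, not taken from the poster's data.
size <- 2
mu   <- 5
prob <- size / (size + mu)   # relation documented in ?rnbinom

# The two parameterisations give the same probability mass function.
same_pmf <- isTRUE(all.equal(dnbinom(0:20, size = size, mu = mu),
                             dnbinom(0:20, size = size, prob = prob)))

# (2) Circles of equal radius centred on neighbouring grid points do not
# touch when twice the radius is smaller than the grid step.
radius <- 100   # metres, as in question 1B
step   <- 300   # metres, as in question 1B
circles_disjoint <- 2 * radius < step

# (3) Rough memory for a dense n x n matrix of doubles (8 bytes each).
n   <- 1262                  # number of postcode points in the message
mib <- n^2 * 8 / 1024^2      # size in MiB

print(same_pmf)
print(circles_disjoint)
print(round(mib, 2))
```

Because the dense-matrix estimate grows with the square of the number of points, doubling the number of postcodes would roughly quadruple it under the same assumption.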