[R-sig-Geo] DCluster

Mon Mar 28 16:42:18 CEST 2005

Hi r-sig-geo,
I am looking at the spatial distribution of poor households in a region 
comprising a gradient of urban-rural postcodes. The data are counts and 
they fit a negative binomial distribution rather than a Poisson 
distribution.
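
A rough way to illustrate this kind of overdispersion is to compare the sample variance of the counts with their mean. The sketch below uses base R only and simulated counts; the size and mu values are invented for illustration and are not the values from my data.

```r
# Simulated counts standing in for the postcode data (hypothetical values)
set.seed(1)
counts <- rnbinom(1000, size = 2, mu = 10)

m <- mean(counts)
v <- var(counts)

# Poisson implies variance == mean; a negative binomial has
# variance = mu + mu^2 / size, which is always larger than the mean
dispersion <- v / m

# Method-of-moments estimate of size (only defined when v > m)
size_mom <- if (v > m) m^2 / (v - m) else NA
```

A dispersion ratio well above 1 points away from the Poisson model; the moment estimate of size is only a starting value, not a maximum likelihood fit.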

I am applying DCluster (ver. 0.1-3, Windows) and would be grateful for 
advice on a few topics.

1. GAM

A) As I understand it, the default setting is based on a Poisson 
distribution. This produces some plausible-looking clusters, but I wonder 
whether I could set up opgam so that it uses a negative binomial 
distribution (for which I have the parameters of the ‘disease’ 
variable: size and mu) or a bootstrap procedure instead. Some of 
the internal functions, like opgam.iscluster.negbin, seem to support 
this, but I am uncertain about how to incorporate them.
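
For reference, base R's negative binomial functions accept either (size, prob) or (size, mu), linked by prob = size / (size + mu). The sketch below only illustrates that conversion with invented values; whether DCluster expects the parameters in this form is an assumption that has not been checked here.

```r
# Hypothetical parameter values, for illustration only
size <- 2
mu   <- 10

# Conversion between the two parameterisations used by base R
prob <- size / (size + mu)

# Both parameterisations describe the same distribution
d_mu   <- dnbinom(0:50, size = size, mu = mu)
d_prob <- dnbinom(0:50, size = size, prob = prob)
same   <- isTRUE(all.equal(d_mu, d_prob))

# Simulated counts under the (size, prob) form, as in rnbinom(n, size, prob)
set.seed(1)
sim <- rnbinom(500, size = size, prob = prob)
```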

B) To reduce the multiple testing problem (Waller & Gotway 2004, 
“Applied Spatial Statistics for Public Health Data”, Wiley, p. 208) I 
wonder whether I should set the radius to less than 50% of the step 
size, e.g. a 100 m radius on a 300 m grid, so that the smallest circles 
won't touch.
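
The geometry of that example can be checked with a few lines of base R. This is a minimal sketch on a small made-up grid (the 0 to 1500 m extent is hypothetical), not a call into DCluster.

```r
step   <- 300  # grid spacing in metres
radius <- 100  # circle radius in metres

# Small illustrative grid of circle centres (extent is hypothetical)
grid <- expand.grid(x = seq(0, 1500, by = step),
                    y = seq(0, 1500, by = step))

# Smallest distance between two distinct centres
d <- as.matrix(dist(grid))
diag(d) <- Inf
min_centre_dist <- min(d)

# Two circles of equal radius touch or overlap when their centres are
# no more than 2 * radius apart
circles_disjoint <- min_centre_dist > 2 * radius

# Share of each grid cell covered by its circle (interior of a large grid)
coverage <- pi * radius^2 / step^2
```

With these numbers the circles are disjoint, but the coverage value also shows the trade-off: only about 35% of the area falls inside any circle.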

2. Besag-Newell

I am getting results with ‘poisson’ (almost everything becomes a cluster, 
possibly because the sites are clumped and not randomly distributed) 
and with ‘permutation’, but I wonder how ‘negbin’ is used. Not like 
this:

> bnresults<-opgam(pcpoor, thegrid=pcpoor[,c("x","y")], alpha=.05,
+ iscluster=bn.iscluster, set.idxorder=TRUE, k=20, model="negbin",
+ R=100, mle=calculate.mle(pcpoor) )
Error in rnbinom(n, size, prob) : invalid arguments

3. Kulldorff & Nagarwalla

Again I struggle with the parameters. Not like this:

> #K&N's method over the centroids
> mle<-calculate.mle(pcpoor, model="negbin")
Error in while (((abs(m - m0) > tol * (m + m0)) || (abs(v - v0) > tol * :
        missing value where TRUE/FALSE needed
> knresults<-opgam(data=pcpoor, thegrid=pcpoor[,c("x","y")], alpha=.05,
+ iscluster=kn.iscluster, fractpop=.5, R=100, model="negbin", mle=mle)
Error in rnbinom(n, size, prob) : invalid arguments

4. Turnbull

Is Turnbull analysis possible in DCluster yet? There are some 
references to it in the manual, but I haven’t been able to locate it.

5. General

A) I am considering increasing the study area (I am currently working 
with 1262 postcode points) and wonder what the limits might be for a 
desktop PC. I gather that the distance matrices (created by tripack or 
spdep) could be a limiting factor? Would it be an idea to run this step 
first and, once the table is created, run the cluster detection 
algorithm?
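
As a back-of-the-envelope check on the memory side, the size of a dense distance matrix can be estimated directly. The sketch below assumes 8-byte doubles and uses random made-up coordinates with base R's dist(); how tripack, spdep or DCluster store distances internally is not something it tries to reproduce.

```r
n <- 1262  # current number of postcode points

# Memory for a full n x n matrix of doubles, and for the lower
# triangle only (the layout used by base R's dist())
bytes_full  <- n^2 * 8
bytes_lower <- n * (n - 1) / 2 * 8
mib_full    <- bytes_full / 2^20

# Distances computed once and kept for later reuse
# (coordinates are random placeholders, not real postcodes)
set.seed(1)
xy <- cbind(x = runif(n), y = runif(n))
d  <- dist(xy)
n_pairs <- length(d)
```

For 1262 points the full matrix is roughly 12 MiB, so memory is not yet the constraint; the cost grows with the square of n, so ten times as many points would need about a hundred times the memory.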

B) I wonder whether permutations are always superior to standard 
statistical distributions, and if not, why not?
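
One point that bears on this question is the resolution of simulation-based p-values. The sketch below uses the common Monte Carlo estimate (1 + r) / (R + 1), where r is the number of simulated statistics at least as large as the observed one; the observed statistic and the null simulations are invented, and this is not taken from DCluster's own code.

```r
set.seed(1)
R <- 100                             # number of simulations, as in R=100 above
observed_stat <- 25                  # hypothetical observed statistic
sim_stats <- rpois(R, lambda = 15)   # hypothetical null simulations

# Monte Carlo p-value: rank of the observed value among the simulations
p_mc <- (1 + sum(sim_stats >= observed_stat)) / (R + 1)

# Smallest p-value attainable with R simulations
p_min <- 1 / (R + 1)
```

With R = 100 no p-value can fall below 1/101, however extreme the observed statistic, whereas a parametric distribution has no such floor but depends on its assumptions holding.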

Best wishes, Jakob

Jakob Petersen
GISc student (MSc)
Birkbeck, University of London