[R-sig-Geo] efficient code/function for rectangular SP weight Matrix and gwr
Sam Field
fieldsh at mail.med.upenn.edu
Fri May 11 16:55:20 CEST 2007
List,
I need to create a rectangular spatial weight matrix for two sets of n
and m objects. I quickly run into memory allocation problems when
constructing the full matrix in a single pass, and I am looking for a
more efficient way of doing this. There appear to be efficient
procedures in spdep for constructing SQUARE spatial weight matrices
(e.g. dnearneigh()). Are there analogous procedures for constructing
distance based weights between two different point patterns? I am doing
this in preparation for implementing an approximate geographically
weighted logistic regression procedure. I was thinking about using a
resampling procedure as the inferential framework, and perhaps I might
get some feedback on the plan, which is laid out in the steps below.
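On the rectangular weight question, here is a minimal sketch of one way to avoid allocating the full matrix: keep only the pairs that fall within a kernel bandwidth. It is written in plain Python rather than R or spdep, and the function name `sparse_kernel_weights`, the bisquare kernel, and the fixed bandwidth are all assumptions made for illustration, not anything taken from an existing package.

```python
import math
from collections import defaultdict

def sparse_kernel_weights(grid_pts, obs_pts, bandwidth):
    """Return {grid_index: [(obs_index, weight), ...]} for pairs closer than bandwidth.

    Assumes a bisquare kernel with a fixed bandwidth (an illustrative choice).
    Observations are binned into square cells of side `bandwidth`, so each
    grid point only has to look at its own cell and the 8 surrounding cells.
    """
    cells = defaultdict(list)
    for j, (x, y) in enumerate(obs_pts):
        cells[(math.floor(x / bandwidth), math.floor(y / bandwidth))].append(j)

    weights = {}
    for i, (gx, gy) in enumerate(grid_pts):
        cx, cy = math.floor(gx / bandwidth), math.floor(gy / bandwidth)
        row = []
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in cells.get((cx + dx, cy + dy), ()):
                    d = math.hypot(gx - obs_pts[j][0], gy - obs_pts[j][1])
                    if d < bandwidth:
                        # Bisquare kernel: (1 - (d/h)^2)^2 inside the bandwidth.
                        row.append((j, (1.0 - (d / bandwidth) ** 2) ** 2))
        weights[i] = row
    return weights
```

Storage grows with the number of pairs inside the bandwidth rather than with n times m, and each row can be consumed and discarded one grid point at a time if even the sparse form is too large.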
I have a point pattern of 30,000 patients, located by where they
lived during a 2 year period. During that period, approximately 4% of
them developed diabetes. I am interested in isolating the impact of
ecological factors on the geographic variation of the disease, so it is
necessary to control for the spatial clustering of individual level risk
factors associated with the disease (diabetes).
Step 1. Estimate a logistic regression on the full sample, predicting
incident diabetes (i.e. who developed diabetes over the two year period)
from individual level covariates.
Step 2. Estimate a weighted logit model at each location (grid point). The
observations would be the people (not the geographic units) and the
weights would be kernel weights based on distance. The model would only
contain a single freely estimated parameter, the intercept, but it would
also contain an offset term. For each patient, the offset term would
simply be an evaluation of the linear predictor of the global model
estimated above (based on the observed covariate values), but without
the intercept. This would effectively fix the estimates of the patient
level coefficients to their global values, requiring only a local
estimate of the intercept. My hope is that I could interpret geographic
variability in the intercept as evidence for a "location effect" net of
the patient composition or "risk profile" at a particular location. It
would probably make sense to center the X variables so that the
intercept is interpretable and estimated in a region of the response
plane where there is plenty of data. I would let the other coefficients
vary as well, but I doubt the model could be estimated in large portions
of the study area because of sparse data.
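A rough sketch of the local fit in Step 2, assuming the global coefficients are already in hand: the offset is the global linear predictor without its intercept (covariates centered), and the only free parameter is the local intercept. The names `centered_offsets` and `local_intercept` are hypothetical, and the root of the weighted score equation is found by bisection rather than by the IRLS routine a GLM package would use.

```python
import math

def expit(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def centered_offsets(X, beta):
    """Global linear predictor without the intercept, with covariates centered."""
    n = len(X)
    means = [sum(row[k] for row in X) / n for k in range(len(beta))]
    return [sum(b * (x - m) for b, x, m in zip(beta, row, means)) for row in X]

def local_intercept(y, offset, w):
    """Weighted intercept-only logit with a fixed offset at one location.

    Solves sum_i w_i * (y_i - expit(a + offset_i)) = 0 for a. The score is
    strictly decreasing in a, so bisection is safe. Returns None when the
    weighted data contain no events or only events (no finite estimate).
    """
    total = sum(w)
    events = sum(wi * yi for wi, yi in zip(w, y))
    if not 0.0 < events < total:
        return None

    def score(a):
        return sum(wi * (yi - expit(a + oi)) for wi, yi, oi in zip(w, y, offset))

    lo, hi = -1.0, 1.0
    while score(lo) < 0.0:   # widen until the score is non-negative at lo
        lo *= 2.0
    while score(hi) > 0.0:   # widen until the score is non-positive at hi
        hi *= 2.0
    for _ in range(200):     # 200 halvings reach machine precision
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With all offsets equal to zero and equal weights this reduces to the logit of the local event proportion, which is a convenient sanity check. The `None` return marks locations where sparse data leave the intercept unidentified.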
Step 3. If I were going to do inference on the location specific
intercepts, I would generate a sampling distribution at each location by
resampling from the global model and repeating Step 2 for each randomly
drawn sample. This would give me a local sampling distribution of
intercept estimates at each location, and I could compare it to the
single estimate generated from the observed data. The global model represents
a kind of null because the intercept is fixed to its global value and
geographic variability is driven entirely by the spatial clustering of
patient level factors.
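The resampling in Step 3 could be sketched as a parametric bootstrap at one location: draw outcomes from the global model's fitted probabilities, refit the local intercept on each draw, and see where the observed estimate falls. Everything here is an assumption for illustration: the names `null_intercepts` and `two_sided_p`, the argument `fit` (any function taking `(y, offset, w)` and returning a local intercept or `None`), and the choice of a two-sided Monte Carlo p-value measured from the mean of the draws.

```python
import random

def null_intercepts(p_global, offset, w, fit, n_rep=199, seed=0):
    """Local intercept estimates under the global (null) model at one location.

    p_global: fitted probabilities from the global model, one per person.
    fit:      hypothetical local estimator, called as fit(y, offset, w).
    Draws for which fit returns None (no finite estimate) are skipped.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_rep):
        # Simulate outcomes from the global model, then redo the local fit.
        y_star = [1 if rng.random() < p else 0 for p in p_global]
        est = fit(y_star, offset, w)
        if est is not None:
            draws.append(est)
    return draws

def two_sided_p(observed, draws):
    """Monte Carlo p-value: share of null draws at least as far from their mean."""
    if not draws:
        raise ValueError("no usable bootstrap draws at this location")
    center = sum(draws) / len(draws)
    extreme = sum(1 for d in draws if abs(d - center) >= abs(observed - center))
    return (extreme + 1) / (len(draws) + 1)
```

Because this is repeated at every grid location, the resulting p-values are many correlated tests, so some adjustment for multiple comparisons would probably be needed before reading a map of them.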
thanks!
Sam