[R-sig-Geo] efficient code/function for rectangular SP weight Matrix and gwr
Stéphane Dray
dray at biomserv.univ-lyon1.fr
Fri May 11 17:55:01 CEST 2007
Hi Sam,
I think that this question is quite general and could interest other
people, including me, with very different aims. I have developed a
method to look for relationships between two data sets that have
been sampled over the same area but at different locations. In my
example, the two samples are two polygon layers. In this approach, I
compute a rectangular weighting matrix where each weight corresponds to
the area of intersection between polygons of the two layers. I have
also used the matrix form to store these weights (my data set was very
small compared to yours). I remember that Roger was also interested in
these rectangular weights in another context. Here we have two
different problems:
- how to compute this kind of weight;
- how to store it.
For the first point, I think that the solution will differ for each
method/application. We could develop/extend the classical tools for
square weights (one set of spatial units) to rectangular weights (two
sets of spatial units).
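For illustration, here is a rough sketch (not spdep code; the function
name rect_dnearneigh, the coordinate matrices cA and cB, the threshold
dmax and the block size are all hypothetical) of one way to build
distance-based rectangular neighbors between two point sets without
ever allocating the full n x m distance matrix: the points of the first
layer are processed block by block.

rect_dnearneigh <- function(cA, cB, dmax, block = 1000) {
  ## cA, cB: two-column coordinate matrices for layers A and B
  nA <- nrow(cA)
  nb <- vector("list", nA)
  for (start in seq(1, nA, by = block)) {
    idx <- start:min(start + block - 1, nA)
    ## cross-distances for this block of A only, never the full matrix
    d <- sqrt(outer(cA[idx, 1], cB[, 1], "-")^2 +
              outer(cA[idx, 2], cB[, 2], "-")^2)
    ## keep, for each point of the block, the indices of B within dmax
    nb[idx] <- lapply(seq_along(idx), function(i) which(d[i, ] <= dmax))
  }
  nb
}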
For the second one, it would probably be interesting to define a class
of object in spdep. nb objects are lists, and I think that a list would
also be the solution for rectangular neighborhoods.
Consider two sets of spatial units (A and B) with na and nb units
respectively. We could store the neighbors in a list of length 2. The
first element of this list is a list of length na whose j-th element is
a vector of the neighbors of the j-th unit of layer A; these neighbors
are spatial units of layer B. The second element of the global list is
a list of length nb where each element is, symmetrically, a vector of
neighbors in layer A.
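In R, a small sketch of this structure could look like the following
(the names rect_nb, A and B are purely illustrative; here na = 3 and
nb = 5):

rect_nb <- list(
  ## neighbors in layer B of each of the na units of layer A
  A = list(c(2L, 5L),    # unit 1 of A has units 2 and 5 of B as neighbors
           1L,           # unit 2 of A has unit 1 of B as neighbor
           integer(0)),  # unit 3 of A has no neighbor in B
  ## the reverse view: neighbors in layer A of each of the nb units of B
  B = list(2L, 1L, integer(0), integer(0), 1L)
)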
I think that we have to think about a class of object that could be
useful for everybody dealing with this kind of rectangular weights. If
this class is properly defined (second point), we could then develop
tools to construct this kind of neighborhood (first point). A possible
extension to more than two data sets could also be taken into account
in this discussion.
Cheers,
Sam Field wrote:
> List,
>
> I need to create a rectangular spatial weight matrix for a set of n and
> m objects. I quickly run into memory allocation problems when
> constructing the full matrix in a single pass. I am looking for a more
> efficient way of doing this. There appear to be efficient procedures in
> spdep for constructing SQUARE spatial weight matrices (e.g.
> dnearneigh()). Are there analogous procedures for constructing
> distance-based weights between two different point patterns? I am doing
> this in preparation for implementing an approximate geographically
> weighted logistic regression procedure. I was thinking about using a
> resampling procedure as an inferential frame; perhaps I might get some
> feedback. This is what I was going to do.
>
> I have a point pattern of 30,000 diabetic people based on where they
> lived during a 2-year period. During that period, approximately 4% of
> them developed diabetes. I am interested in isolating the impact of
> ecological factors on the "geographic variation" of the disease, so it
> is necessary to control for the spatial clustering of individual-level
> risk factors associated with the disease (diabetes).
>
> Step 1: Estimate a logistic regression using the full sample and
> predict diabetes incidence using individual-level covariates (i.e. who
> developed diabetes over the two-year period).
>
> Step 2. Estimate a weighted logit model at each location (grid point). The
> observations would be the people (not the geographic units) and the
> weights would be kernel weights based on distance. The model would only
> contain a single freely estimated parameter, the intercept, but it would
> also contain an offset term. For each patient, the offset term would
> simply be an evaluation of the linear predictor of the global model
> estimated above (based on the observed covariate values), but without
> the intercept. This would effectively fix the estimates of the patient
> level coefficients to their global values, requiring only a local
> estimate of the intercept. My hope is that I could interpret geographic
> variability in the intercept as evidence for a "location effect" net of
> the patient composition or "risk profile" at a particular location. It
> would probably make sense to center the X variables so that the
> intercept was interpretable and estimated in a region of the response
> plane where there is plenty of data. I would let the other covariates
> vary as well, but I doubt the model could be estimated in large portions
> of the study area because of sparse data.
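For what it's worth, a minimal sketch of the local fit described in
Step 2 could look like this (y is the 0/1 response, glob the fitted
global model, kw the kernel weights for the current grid point; all
object names are hypothetical):

## linear predictor of the global model WITHOUT its intercept
eta <- as.vector(model.matrix(glob)[, -1, drop = FALSE] %*% coef(glob)[-1])
## local model: only the intercept is free, the rest enters as an offset
fit <- glm(y ~ 1, family = binomial, weights = kw, offset = eta)
local_intercept <- coef(fit)[1]
## note: non-integer kernel weights trigger a warning from binomial glm();
## family = quasibinomial gives the same point estimates without the warning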
>
> Step 3. If I were going to do inference on the location-specific
> intercepts, I would generate a sampling distribution at each location by
> resampling from the global model, repeating Step 2 for each randomly
> drawn sample. This would give me a local sampling distribution of
> intercept estimates at each location, and I could compare it to the
> single one generated from the observed data. The global model represents
> a kind of null because the intercept is fixed to its global value and
> geographic variability is driven entirely by the spatial clustering of
> patient-level factors.
>
>
> thanks!
>
> Sam
>
--
Stéphane DRAY (dray at biomserv.univ-lyon1.fr )
Laboratoire BBE-CNRS-UMR-5558, Univ. C. Bernard - Lyon I
43, Bd du 11 Novembre 1918, 69622 Villeurbanne Cedex, France
Tel: 33 4 72 43 27 57 Fax: 33 4 72 43 13 88
http://biomserv.univ-lyon1.fr/~dray/