[R-sig-Geo] Help with creating a weight matrix for point dataset of 78, 000 observations
Roger.Bivand at nhh.no
Mon Jun 17 10:12:46 CEST 2013
On Sun, 16 Jun 2013, Susan Gorelick wrote:
> I need to create a weight matrix in R for point dataset of 78,000
> observations from a shapefile( from ArcGIS) which has both sets of
> longitudes and latitudes and X-Y coordinates (based on state plane feet).
> So far I projected the dataset in ESRI coordinates using rgdal maptools.
Your data are point data. Did you look at a summary() of the object after
reading it in? You would look at summary in Stata, so do the same in R.
> I keep getting error message as below:
> shapeR_SP <-spTransform(shapeR, CRS("+init=ESRI:102730"))
Did you check what the CRS of the imported object was?
>> IDs <-row.names(as(shapeR_SP, "data.frame"))
>> shapeR_nbq <-poly2nb(shapeR_SP)
> Error: extends(class(pl), "SpatialPolygons") is not TRUE
You did read ?poly2nb, didn't you? You have data with point support, so if
you really want to treat them as polygons, you would go through a
triangulation first - then tri2nb?
>> shapeR_kn6 <-knn2nb(knearneigh(coords, k=6), row.names=IDs)
> Warning message:
> In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return
> = TRUE, :
> there is no package called 'RANN'
> where is 'RANN' from?!
Please do understand how R packages work. Do visit the spdep page on CRAN
- you'll see that RANN (a clever implementation of fast approximate
nearest neighbours) is "Suggests:". If spdep had "Depends:" on RANN, it
would have been installed automatically when you installed spdep; when
packages are "Suggests:", they are optional extras. In your case with
point support, RANN will be faster than using the fallback, so install it.
This is just as in using additional, user-contributed code in Stata, the
user chooses what to install.
> The outcome goal is to address the spatial dependence in hedonic model with
> house sales transaction study. I like to run the spatial dependence test so
> that I can refine the spatial regession analysis.
> I would appreciate any suggestions and/or answers to the following
> 1. What is the sample size capacity of R (weight matrix size)? This
> is the main reason I switched back to R from Stata. Is there any means to
> achieve the outcomes I am looking for?
> 2. Do I need to install.view ("Spatial)? When I tried to
> install.view("Spatial"), I've been getting a message that that package is
> not available for R 3.0 which I am using - R studio. Is "Spatial" different
> from "spatial" package? Do I need to find a different version of R?
No, but you should read up on and understand how R handles contributed
packages. It may be that R Studio is getting in the way adding an extra
layer of stuff between you and the console - it will work fine when you
know what you are doing, but will not help you to learn, it is intended
to support programming in R when the programmer knows what to do.
> 3. Is there a better to achieve the outcome? For instance, using
> GeoDa to create a weight matrix?
Generating neighbour objects for 78k points is not a problem:
xy <- matrix(runif(78000*2), ncol=2)
system.time(nb6 <- knn2nb(knearneigh(xy, k=6)))
# user system elapsed
# 7.093 0.047 7.142
You may of course check this by creating k6 neighbours in GeoDa from the
projected point shapefile, and reading the resulting GAL file into R.
However, this is an asymmetric neighbour list, so fitting ML models will
be done accurately with method="LU", which will be slower than
method="Matrix" using updating spare Cholesky log determinants for
symmetric neighbours. You may also use method="MC" for Monte Carlo log
determinant approximations. The default method="eigen" will have memory
problems with your dense 78k by 78k matrix.
You should also be able to use the full range of GM estimators (as you
would in Stata). Check by exporting the neighbours generated in R to Stata
lw <- nb2listw(nb, style="W")
# for row standardised weights
sn <- listw2sn(lw)
spmat import W using nb_w.gwt, geoda replace
spreg gs2sls Y <your Xs>, id(ID) dlmat(W) elmat(W)
See for details:
Hope this clarifies,
> Thank you.
> Susan Shim Gorelick
> [[alternative HTML version deleted]]
> R-sig-Geo mailing list
> R-sig-Geo at r-project.org
Department of Economics, NHH Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no
More information about the R-sig-Geo