[R-sig-Geo] A package for spatial data classes: request for comments

Edzer J. Pebesma e.pebesma at geog.uu.nl
Fri Oct 31 12:29:48 CET 2003


Much like time, spatial locations are not likely to change for observed
data. Still, R has no infrastructure or knowledge about spatial locations.

Many packages that deal with spatial data in R define their own classes
for it. At the workshop on spatial statistics software held during
DSC2003 (organized by Roger Bivand), we decided that a base package
that defines classes for spatial data would be helpful, both as an
exchange platform and (later) as a required package for working with
spatial data (compare the ts class for time series).

I started writing such a package, and opened a sourceforge.net project
for it, called r-spatial. I first worked with Barry Rowlingson's
spatial.data.frame (found in r-asp, or rasp, also on sourceforge),
which is an S3 class. Then I started over using S4 classes, because
they are here to stay, and allow validation.

So far I have defined three classes:

+ SpatialData
    +-- SpatialDataFrame
        +-- SpatialDataFrameGrid

SpatialData is only meant as a base class; it only contains a bounding
box (2D or 3D data), and information about projection (if present),
anticipating (re)projecting facilities in Roger's proj4R package.

SpatialDataFrame extends this class; it holds a data frame, plus the
information about where in the data frame the coordinates (2D or 3D)
are stored.
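
As a rough illustration of the two classes described so far, here is a
minimal S4 sketch. It is not the SpatialCls code: the slot names (bbox,
proj4string, data, coord.names) and the helper make_sdf() are
assumptions made up for this example. It also shows the kind of
validation S4 allows, including rejecting missing coordinate values.

```r
library(methods)

# Base class: bounding box plus optional projection information.
# Slot names are hypothetical, chosen for this sketch only.
setClass("SpatialData",
  representation(bbox = "matrix", proj4string = "character"),
  validity = function(object) {
    bb <- object@bbox
    if (!is.numeric(bb) || ncol(bb) != 2 || !(nrow(bb) %in% c(2, 3)))
      return("bbox must be numeric with 2 or 3 rows and 2 columns (min, max)")
    TRUE
  })

# Extension: a data frame plus the names of its coordinate columns.
setClass("SpatialDataFrame",
  representation("SpatialData", data = "data.frame",
                 coord.names = "character"),
  validity = function(object) {
    if (!all(object@coord.names %in% names(object@data)))
      return("coordinate columns not found in data")
    if (any(is.na(object@data[object@coord.names])))
      return("missing values in coordinates are not allowed")
    TRUE
  })

# Hypothetical constructor: computes the bounding box from the data.
make_sdf <- function(df, coord.names) {
  xy <- as.matrix(df[coord.names])
  bb <- t(apply(xy, 2, range))      # one row per dimension
  colnames(bb) <- c("min", "max")
  new("SpatialDataFrame", bbox = bb, proj4string = NA_character_,
      data = df, coord.names = coord.names)
}

pts <- data.frame(x = c(1, 2, 3), y = c(10, 20, 30), zinc = c(5, 6, 7))
s <- make_sdf(pts, c("x", "y"))
s@bbox
```

Because new() runs the validity functions of the class and of its
superclass, an object with an NA coordinate is never created; the call
stops with an error instead.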

SpatialDataFrameGrid extends SpatialDataFrame for the case where
the data are on a regular 2D/3D grid; it contains the offset of the grid,
the cell size in (x,y,z) and the number of rows/columns/layers. One
simple way of creating these classes (idea taken from Barry) is:

 > data(meuse.grid) # data frame with gridded data
 > coordinates(meuse.grid) = c("x", "y") # promote to SpatialDataFrame
 > gridded(meuse.grid) = TRUE # promote to SpatialDataFrameGrid

In the last expression, the grid topology is auto-detected and stored.
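
One way such auto-detection could work is sketched below. This is not
necessarily what gridded() does: detect_grid() is a hypothetical helper,
and it assumes that the offset is the smallest coordinate value in each
dimension and that the cell size is the smallest gap between distinct
coordinate values.

```r
# Hypothetical helper: derive offset, cell size and number of cells for
# each coordinate column of a data frame. Gaps larger than one cell are
# accepted as long as they are whole multiples of the cell size, so a
# grid with cells left out still qualifies.
detect_grid <- function(coords, tol = 1e-8) {
  one_dim <- function(v) {
    u <- sort(unique(v))
    if (length(u) < 2)
      stop("need at least two distinct values per dimension")
    gaps <- diff(u)
    cellsize <- min(gaps)
    steps <- gaps / cellsize
    if (any(abs(steps - round(steps)) > tol))
      stop("coordinates are not on a regular grid")
    c(offset = u[1], cellsize = cellsize, n = sum(round(steps)) + 1)
  }
  # coords must be a data frame: sapply() then visits one column at a time
  as.data.frame(t(sapply(coords, one_dim)))
}

full <- expand.grid(x = seq(100, 180, by = 40), y = seq(0, 60, by = 20))
masked <- full[-1, ]   # drop one cell; the topology is still recoverable
g <- detect_grid(masked)
g
```

The result has one row per dimension, with columns offset, cellsize
and n.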

Missing values for coordinates are not allowed.

This now works; I need to add more tests for a lot of pathological cases.

To be really useful, the class should include vector (polygon) data,
and probably line elements. For this, I need help. The simplest approach
would be to extend SpatialDataFrame to SpatialDataFramePolygon,
and add the corresponding polygon for each row.
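
To make the "one polygon per row" idea concrete, here is a minimal
sketch. Everything in it is an assumption: the class name PolygonTable
and its slots are invented for illustration, each polygon is taken to be
a two-column numeric matrix of (x, y) vertices, and, to keep the example
self-contained, the class stands alone rather than extending
SpatialDataFrame as proposed above.

```r
library(methods)

# Hypothetical class: attribute data plus one polygon per data row.
setClass("PolygonTable",
  representation(data = "data.frame", polygons = "list"),
  validity = function(object) {
    if (length(object@polygons) != nrow(object@data))
      return("need exactly one polygon per data row")
    ok <- vapply(object@polygons, function(p)
      is.matrix(p) && is.numeric(p) && ncol(p) == 2 &&
        nrow(p) >= 3 && !any(is.na(p)),
      logical(1))
    if (!all(ok))
      return("each polygon must be a complete numeric matrix with 2 columns and at least 3 vertices")
    TRUE
  })

square   <- cbind(x = c(0, 1, 1, 0), y = c(0, 0, 1, 1))
triangle <- cbind(x = c(2, 3, 2.5), y = c(0, 0, 1))

pt <- new("PolygonTable",
          data = data.frame(id = 1:2, name = c("square", "triangle")),
          polygons = list(square, triangle))
length(pt@polygons)
```

A richer representation (holes, multiple rings per row, shapefile
attributes) would need more than a plain list of matrices; this only
shows the row-to-polygon pairing and its validation.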

Questions:
- Did I overlook important things in the current proposal?
- Is there some class in a package that serves as a good
starting point for vector/polygon data?
- Which information should be stored for each polygon? (If I look
at class "Map" in maptools, there is a lot!)
- Should we anticipate exchangeability with shapefiles, and
store everything needed for them right from the start? If yes,
how do we deal with much simpler representations, such as those in
package maps?
- Should we work towards two extensions of SpatialDataFrame,
the first with just the polygons, the second with all the
information shapefiles require?
- Will there ever be a need to export R data as shapefiles?
If not, which part of the information in shapefiles may be ignored?
- Do we need another name, instead of the current SpatialCls?


You can download SpatialCls via CVS; use:

export CVS_RSH=ssh
cvs -d:pserver:anonymous@cvs.sf.net:/cvsroot/r-spatial login
# press return on the password prompt
cvs -d:pserver:anonymous@cvs.sf.net:/cvsroot/r-spatial co SpatialCls

If you are interested in becoming a co-developer, please join!
--
Edzer



