[R-sig-Geo] "Merge" shapefiles

Roger Bivand Roger.Bivand at nhh.no
Fri Nov 14 11:23:01 CET 2014


On Fri, 14 Nov 2014, Tyler Frazier wrote:

> Not to belittle the spatial capabilities of R, but this sounds like a 
> function that would be better addressed with PostgreSQL/postgis. 
> Integrating r & pgsql can be a good combination.

Maybe, but what is needed here is clarity of thinking, which always helps 
more than guesswork. Even if you already know PostGIS, you would still 
need that clarity, and the steps would be very similar, although with the 
possibility to link identical objects.

>> On Nov 14, 2014, at 1:11 AM, Steven Ranney <steven.ranney at gmail.com> wrote:
>>
>> All -
>>
>> I am slowly learning more about spatial data in R.  However, I am still 
>> a relative neophyte.
>>
>> What I want to do:
>>
>> I have two shapefiles, shpA has ~401,000 individual polygons with
>> attributes.  shpB is a subset of those polygons with different attribute
>> data.  Even though shpB is a subset of those data, there may be multiple
>> rows for a given polygon, thus giving shpB more total rows (~780,000).
>>

You must decide what you want to do in detail, for instance whether these 
representations make any sense. You do not provide a motivation or an 
affiliation, which makes it hard to guess your application domain 
(ecology, real estate, whatever).

You have ~401,000 individual polygons with IDs and some data. Are the IDs 
unique? Do the polygons overlap? Are they home ranges (which may overlap) 
or census blocks (which shouldn't)?

Then you have extra data that happens to be in a messy shapefile with 
repeated geometries, all of which match some of those in the first 
data set (it never needed to be a shapefile, and probably never should 
have been). Can you match them by ID (match() is much stronger than 
merge(), because it shows you what is matching)?
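
A minimal sketch of that kind of ID check, using base R only. The two data 
frames are invented stand-ins for the data slots of shpA and shpB; only 
the column name APN_LABEL comes from the question, everything else is 
assumed for illustration.

```r
# Toy stand-ins for shpA@data and shpB@data (values are invented).
a <- data.frame(APN_LABEL = c("001", "002", "003", "004"),
                area = c(10.5, 3.2, 7.7, 1.1),
                stringsAsFactors = FALSE)
b <- data.frame(APN_LABEL = c("002", "002", "004", "004", "004"),
                value = c(5, 6, 7, 8, 9),
                stringsAsFactors = FALSE)

# Are the IDs in the first table unique?
any_dup_a <- anyDuplicated(a$APN_LABEL) > 0

# For each row of b, the row of a that it matches (NA if there is none).
idx <- match(b$APN_LABEL, a$APN_LABEL)

# IDs in b that have no partner in a.
unmatched_b <- unique(b$APN_LABEL[is.na(idx)])

# Number of rows of b per ID of a, including the zeros.
n_per_id <- table(factor(b$APN_LABEL, levels = a$APN_LABEL))

print(any_dup_a)
print(idx)
print(unmatched_b)
print(n_per_id)
```

Unlike merge(), nothing is joined here; the index vector and the counts 
just show what would match before any rows are combined.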

Note that you expect to get >=0 matches on each geometry from the first 
object. You need to control what is going on, because the maximum number 
of matches will determine the number of columns in the output (with lots 
of missing values where there are fewer than this). Are the repeat 
geometries there because the repeats are at different times? Should you be 
trying to construct an appropriate space-time object if this is the case?
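
To illustrate how the maximum number of matches drives the number of 
output columns, here is a rough base R sketch. The data frames and the 
helper column k (a within-ID repeat counter) are hypothetical; only 
APN_LABEL is taken from the question. It is one possible way to spread 
repeats into columns, not a recommendation to do so.

```r
# Toy stand-ins for the two attribute tables (values are invented).
a <- data.frame(APN_LABEL = c("001", "002", "003", "004"),
                area = c(10.5, 3.2, 7.7, 1.1),
                stringsAsFactors = FALSE)
b <- data.frame(APN_LABEL = c("002", "002", "004", "004", "004"),
                value = c(5, 6, 7, 8, 9),
                stringsAsFactors = FALSE)

# Hypothetical repeat counter: 1, 2, ... within each ID.
b$k <- ave(seq_along(b$APN_LABEL), b$APN_LABEL, FUN = seq_along)

# One row per ID; one value column per repeat, up to the largest count.
wide <- reshape(b, idvar = "APN_LABEL", timevar = "k", direction = "wide")

# Align to the order of a; IDs with no match get NA in every value column.
idx <- match(a$APN_LABEL, wide$APN_LABEL)
value_cols <- setdiff(names(wide), "APN_LABEL")
out <- data.frame(a, wide[idx, value_cols], row.names = NULL)

print(out)
```

In this toy case the largest repeat count is 3, so there are three value 
columns, and every ID with fewer repeats is padded with NA.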


>> Effectively, I want to merge these two shapefiles.  With two dataFrame
>> objects in R, I would merge them like
>>
>> merge(shpA, shpB, by = "APN_LABEL", all = TRUE)
>>
>> but apparently, this doesn't work with shapefiles.  I have tried
>>
>> merge(shpA@data, shpB@data, by = "APN_LABEL", all = TRUE)
>>
>> which creates a dataFrame of the two files but drops all of the spatial
>> geometries.

Yes, of course, what did you expect? The only references available say 
that there is no merge method for Spatial* objects, and you are anyway 
taking their data slots, which are data frames. If the output object has 
the same number of rows as shpA, and its row.names() matches that of shpA, 
you may have what you want (create a new SPDF object with the 
SpatialPolygons from shpA, and the output from merge as its data slot), 
but beware of merge() re-ordering rows. This is, however, dependent on 
prior checking for consistency in the IDs.
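
A sketch of that construction, assuming the sp package is installed and 
that the extra table has at most one row per ID. The toy squares, the 
polygon IDs p1 to p3, the helper sq() and the table "extra" are all 
invented for illustration; only APN_LABEL comes from the question.

```r
library(sp)

# Hypothetical helper: a unit square with lower-left corner (x0, 0).
sq <- function(x0, id) {
  xy <- cbind(c(x0, x0 + 1, x0 + 1, x0, x0), c(0, 0, 1, 1, 0))
  Polygons(list(Polygon(xy)), ID = id)
}

# Toy stand-in for shpA; the IDs are deliberately not in sorted order.
shpA <- SpatialPolygonsDataFrame(
  SpatialPolygons(list(sq(0, "p1"), sq(1, "p2"), sq(2, "p3"))),
  data.frame(APN_LABEL = c("C", "A", "B"), area = c(1, 1, 1),
             row.names = c("p1", "p2", "p3"), stringsAsFactors = FALSE))

# Toy extra attributes, one row per ID at most.
extra <- data.frame(APN_LABEL = c("A", "B"), value = c(10, 20),
                    stringsAsFactors = FALSE)

# Check ID consistency before joining anything.
stopifnot(!anyDuplicated(shpA$APN_LABEL), !anyDuplicated(extra$APN_LABEL))

# merge() sorts on the key, so the result comes back as A, B, C.
m <- merge(shpA@data, extra, by = "APN_LABEL", all.x = TRUE)

# Put the rows back in the order of shpA and restore its row names.
m <- m[match(shpA$APN_LABEL, m$APN_LABEL), ]
row.names(m) <- row.names(shpA)

# New object: geometries from shpA, merged table as the data slot.
out <- SpatialPolygonsDataFrame(as(shpA, "SpatialPolygons"), m,
                                match.ID = TRUE)
print(out@data)
```

With match.ID = TRUE the constructor matches the data row names against 
the polygon IDs, which acts as a further check that rows and geometries 
still correspond.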

>>
>> I've looked into gUnion() as it seems like that may be what I'm looking
>> for, but I get the following error:

Just fishing without understanding is always pretty hopeless. Why would 
you expect that a function that is declared to only handle geometries 
could sort out your data cleaning problem?

>>
>> tmp <- gUnion(shpA, shpB)
>> Error in RGEOSBinTopoFunc(spgeom1, spgeom2, byid, id, drop_lower_td,
>> "rgeos_union") :
>>  std::bad_alloc
>>
>> Ultimately, I want a shapeFile of all ~401,000 geometries in shpA that
>> includes ALL of the attribute data from shpB that may exist in multiple
>> rows for a given polygon.

Yes, but you need to think first; I'm not even sure why these polygons 
might be meaningful anyway - you didn't say. Guessing by function name 
really doesn't help. Did reading the "combine_maptools" vignette help?

http://cran.r-project.org/web/packages/maptools/vignettes/combine_maptools.pdf

Hope this clarifies,

Roger

>>
>> Is this possible?  Is this simple?
>>
>> Steven H. Ranney
>>
>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo

-- 
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 91 00
e-mail: Roger.Bivand at nhh.no


