[R-sig-Geo] overlapping polygons in shapefile

Wed Mar 12 12:19:48 CET 2014

Perhaps, rangeMapper can be of use. It is using spsample(, type = 
"regular") and over()  so it is similar with Roger's example below.

# Species richness of American wrens -----
require(rangeMapper); require(RSQLite); require(rgdal)
dbcon = rangeMap.start(file = "test.sqlite", overwrite = TRUE, dir = 
tempdir() )
f = system.file(package = "rangeMapper", "extdata", "wrens", 
"vector_combined")
r = readOGR(f, "wrens")
global.bbox.save(con = dbcon, bbox = f)
gridSize.save(dbcon, gridSize = 2) ; canvas.save(dbcon)
processRanges(spdf = r, con =  dbcon, ID = "sci_name" )
rangeMap.save(dbcon) # FUN is missing so species richness is computed by 
default. see ?rangeMap.save for other args
m = rangeMap.fetch(dbcon)
plot(m)
####

Mihai

On 3/12/2014 11:00 AM, Roger Bivand wrote:
> Initial example below, please scroll down:
>
> On Wed, 12 Mar 2014, Roger Bivand wrote:
>
>> On Wed, 12 Mar 2014, Alice C. Hughes wrote:
>>
>> Please do not take threads off-list - I am not offering private 
>> advice, and what is said on the list is intended to be public so that 
>> others can contribute.
>>
>>> Hi
>>>
>>> Thanks so much for the informative answer-but I'm not certain if it 
>>> will yield what I am looking for-basically I have distribution range 
>>> polygons for multiple species and I want to collapse them down to 
>>> give the number or species sugested to be in an area-whether that be 
>>> 1 or 1000 in map form-so basically you have a "heat-map" or species 
>>> occurences: which tool would yield this? I have all the polygons in 
>>> one shapefile at present in geographic projection and WGS 84, 
>>> decimal degrees Thanks so much for your help
>>
>> The whole point of using R is that users are encouraged to create and 
>> modify tools from parts (aka functions or methods). Most research 
>> problems are coerced into fixed subsets of problems that can be 
>> tackled with "other people's" tools. If researchers are to take 
>> responsibility for actually tackling research questions with the best 
>> alignment of data collection, data representation, workflow in data 
>> handling, analysis, it is their respopnsibility to ensure that the 
>> tools are as well aligned with the problem as possible. Otherwise, 
>> your ability to infer about the actual processes will have been 
>> amputated by the limitations imposed by using tools not matching your 
>> problem.
>>
>> Change of support is particularly important here, so your ourput map 
>> of species incidence (don't call it a heatmap, it isn't measuring 
>> heat) depends crucially on steps in the workflow (for example, what 
>> is the error involved in delineating the range polygons?).
>>
>> I was mislead by your phrasing to think that you also needed the 
>> areas of each species x species overlap (counts are a bit trivial), 
>> and you haven't said what you mean by area (is it a measure, a 
>> polygon, or a raster cell?). What does: "number o[f] species sugested 
>> to be in an area" mean? Do you mean a raster cell? Or the polygonal 
>> intersection output for each unique count? Or what?
>>
>> gOverlaps(..., byid=TRUE) should give counts (it returns a matrix, 
>> and the rows/columns should be the counts if the polygons are 
>> species). However, if you have many polygons, do use the STR tree 
>> approach only to search for overlaps among polygons that actually 
>> have intersecting bounding boxes. You can count the number of 
>> overlaps in this way. If someone can provide random overlapping 
>> polygons in a SpatialPolygons object, it would be easier to show. All 
>> R lists do ask for reproducible cases with built-in or simulated data.
>
> A first cut at an example:
>
> library(sp)
> library(rgeos)
> box <- readWKT("POLYGON((0 0, 0 1000, 1000 1000, 1000 0, 0 0))")
> plot(box)
> set.seed(1)
> pts <- spsample(box, n=2000, type="random")
> pols <- gBuffer(pts, byid=TRUE, width=50)
> plot(pols, add=TRUE)
> STR <- gBinarySTRtreeQuery(pols, pols)
> length(STR)
> summary(sapply(STR, length))
> res <- vector(mode="list", length=length(STR))
> for(i in seq(along=STR)) res[[i]] <- gOverlaps(pols[i], pols[STR[[i]]],
>   byid=TRUE)
> summary(sapply(res, sum)-1) # to remove self-counting, i overlaps i
>
> So then we have # overlap counts per input polygon - but what are the 
> output reporting units? Should we actually be counting the number of 
> polygons that a fine grid of points over box belong to? If we focus on 
> this, then maybe:
>
> GT <- GridTopology(c(0.5, 0.5), c(1, 1), c(1000, 1000))
> SG <- SpatialGrid(GT)
> o <- over(SG, pols, returnList=TRUE)
> ct <- sapply(o, length)
> summary(ct)
> SGDF <- SpatialGridDataFrame(SG, data=data.frame(ct=ct))
> spplot(SGDF, "ct", col.regions=bpy.colors(20))
>
> is closer? I haven't checked why the overlaps and the grid over counts 
> differ - homework for someone? How does the output resolution affect 
> the detected number of hits?
>
> Roger

-- 
+-----------------------------------------------+
   Mihai Valcu, PhD
   Max Planck Institute for Ornithology
   Behavioural Ecology and Evolutionary Genetics
   Eberhard-Gwinner Street 7
   D-82319 Seewiesen
	o Phone:  +49-(0)8157-932-343
	o Fax:    +49-(0)8157-932-400
	o Mobile: +49-(0)1522-363-0828
	o http://www.orn.mpg.de/valcu
+-----------------------------------------------+