[R-sig-Geo] overlapping polygons in shapefile

Roger Bivand Roger.Bivand at nhh.no
Wed Mar 12 11:00:32 CET 2014


Initial example below, please scroll down:

On Wed, 12 Mar 2014, Roger Bivand wrote:

> On Wed, 12 Mar 2014, Alice C. Hughes wrote:
>
> Please do not take threads off-list - I am not offering private advice, and 
> what is said on the list is intended to be public so that others can 
> contribute.
>
>> Hi
>> 
>> Thanks so much for the informative answer-but I'm not certain if it will 
>> yield what I am looking for-basically I have distribution range polygons 
>> for multiple species and I want to collapse them down to give the number or 
>> species sugested to be in an area-whether that be 1 or 1000 in map form-so 
>> basically you have a "heat-map" or species occurences: which tool would 
>> yield this? I have all the polygons in one shapefile at present in 
>> geographic projection and WGS 84, decimal degrees Thanks so much for your 
>> help
>
> The whole point of using R is that users are encouraged to create and modify 
> tools from parts (aka functions or methods). Most research problems are 
> coerced into fixed subsets of problems that can be tackled with "other 
> people's" tools. If researchers are to take responsibility for actually 
> tackling research questions with the best alignment of data collection, data 
> representation, workflow in data handling, analysis, it is their 
> respopnsibility to ensure that the tools are as well aligned with the problem 
> as possible. Otherwise, your ability to infer about the actual processes will 
> have been amputated by the limitations imposed by using tools not matching 
> your problem.
>
> Change of support is particularly important here, so your ourput map of 
> species incidence (don't call it a heatmap, it isn't measuring heat) depends 
> crucially on steps in the workflow (for example, what is the error involved 
> in delineating the range polygons?).
>
> I was mislead by your phrasing to think that you also needed the areas of 
> each species x species overlap (counts are a bit trivial), and you haven't 
> said what you mean by area (is it a measure, a polygon, or a raster cell?). 
> What does: "number o[f] species sugested to be in an area" mean? Do you mean 
> a raster cell? Or the polygonal intersection output for each unique count? Or 
> what?
>
> gOverlaps(..., byid=TRUE) should give counts (it returns a matrix, and the 
> rows/columns should be the counts if the polygons are species). However, if 
> you have many polygons, do use the STR tree approach only to search for 
> overlaps among polygons that actually have intersecting bounding boxes. You 
> can count the number of overlaps in this way. If someone can provide random 
> overlapping polygons in a SpatialPolygons object, it would be easier to show. 
> All R lists do ask for reproducible cases with built-in or simulated data.

A first cut at an example:

library(sp)
library(rgeos)
box <- readWKT("POLYGON((0 0, 0 1000, 1000 1000, 1000 0, 0 0))")
plot(box)
set.seed(1)
pts <- spsample(box, n=2000, type="random")
pols <- gBuffer(pts, byid=TRUE, width=50)
plot(pols, add=TRUE)
STR <- gBinarySTRtreeQuery(pols, pols)
length(STR)
summary(sapply(STR, length))
res <- vector(mode="list", length=length(STR))
for(i in seq(along=STR)) res[[i]] <- gOverlaps(pols[i], pols[STR[[i]]],
   byid=TRUE)
summary(sapply(res, sum)-1) # to remove self-counting, i overlaps i

So then we have # overlap counts per input polygon - but what are the 
output reporting units? Should we actually be counting the number of 
polygons that a fine grid of points over box belong to? If we focus on 
this, then maybe:

GT <- GridTopology(c(0.5, 0.5), c(1, 1), c(1000, 1000))
SG <- SpatialGrid(GT)
o <- over(SG, pols, returnList=TRUE)
ct <- sapply(o, length)
summary(ct)
SGDF <- SpatialGridDataFrame(SG, data=data.frame(ct=ct))
spplot(SGDF, "ct", col.regions=bpy.colors(20))

is closer? I haven't checked why the overlaps and the grid over counts 
differ - homework for someone? How does the output resolution affect the 
detected number of hits?

Roger

>
> Roger
>
>> 
>> Alice
>> 
>> 
>> 
>> Alice C. HughesAssociate ProfessorCentre for Integrative 
>> Conservation,Xishuangbanna Tropical Botanical Garden,Chinese Academy of 
>> SciencesMenglun, Mengla, Yunnan 666303, P.R. ChinaPh: 
>> 15198676559achughes at xtbg.ac.cn
>> 
>>> Date: Tue, 11 Mar 2014 20:56:02 +0100
>>> From: Roger.Bivand at nhh.no
>>> To: dr_achughes at hotmail.co.uk
>>> CC: r-sig-geo at stat.math.ethz.ch
>>> Subject: Re: [R-sig-Geo] overlapping polygons in shapefile
>>> 
>>> On Tue, 11 Mar 2014, Alice C. Hughes wrote:
>>> 
>>>> Hi All
>>>> This should be simple-but ArcGis keeps getting stuck at 22%...I have a
>>>> large shapefile, which contains multiple overlapping polygons for
>>>> various species-and I would like to count for any area how many polygons
>>>> overlap.A script would be really fantastic!Thanks in advance
>>> 
>>> Wouldn't it just? However, as usual, such scripts will only cause grief,
>>> as they will not (necessarily) match your workflow. Most likely, you
>>> have two choices, one to rasterise thr polygons separately (not
>>> together, because they overlap), then count overlapping cells.
>>> 
>>> If the shapefile is "large", like tens of thousands of polygons, you could
>>> try the rgeos package, first using gUnarySTRtreeQuery() or
>>> gBinarySTRtreeQuery() to identify possible overlapping candidates (these
>>> functions report intersecting bounding boxes of polygons).
>>> 
>>> Next try gOverlaps() subsetting the input objects by the output of the
>>> intersecting bounding boxes. Once you've found the overlaps, take their
>>> intersections with gIntersection(), and get the area with gArea(). Note
>>> that this only applies to planar, projected polygons. If they are in
>>> geographical coordinates, project before using gArea(). Look carefully at
>>> the byid= arguments in gOverlaps, gIntersection and gArea. Depending on
>>> the complexity of the setting, you may be using for loops and subsetting
>>> by indices - there are examples in Ch. 5 of ASDAR (2nd edition) of this
>>> kind of subsetting (or see the online code at www.asdar-book.org), about
>>> chunks 41-3 in http://www.asdar-book.org/book2ed/cm2_mod.R.
>>> 
>>> Hope this helps,
>>> 
>>> Roger
>>> 
>>>> Alice
>>>> 
>>>> Alice C. HughesAssociate ProfessorCentre for Integrative
>>>> Conservation,Xishuangbanna Tropical Botanical Garden,Chinese Academy of
>>>> SciencesMenglun, Mengla, Yunnan 666303, P.R. ChinaPh:
>>>> 15198676559achughes at xtbg.ac.cn
>>>> 
>>>> Date: Mon, 6 May 2013 14:33:29 +1000
>>>> Subject: Re: [Trop-R] Kernel Density Estimation Bootstrap
>>>> From: lhodgson86 at gmail.com
>>>> To: tropical-r at googlegroups.com
>>>> 
>>>> Hi Christian,
>>>> This formatting shows that the data has been read in as a single column 
>>>> with 'V1' as the column name and 'x,y' as the value of row 1, and so on:
>>>> 
>>>>> test<-read.table("test.csv")> test
>>>>                V11              x,y
>>>> 2   648501,89104003   641686,8917264
>>>> 4   645984,89143285   647058,8913794
>>>> 6   649659,89129257   649613,8912828
>>>> .....
>>>> 
>>>> You can check how many columns there are in a data frame using 'ncol':
>>>> 
>>>> ncol(test)
>>>> To troubleshoot, I would first try using read.csv:
>>>> 
>>>> test=read.csv(test,as.is=TRUE)ncol(test)
>>>> 
>>>> If this does not give you two columns, then I would check that the file 
>>>> really is in standard .csv format.
>>>> 
>>>> Lauren
>>>> 
>>>> On Mon, May 6, 2013 at 1:17 PM, Christian Gredzens 
>>>> <christian.gredzens at my.jcu.edu.au> wrote:
>>>> 
>>>> Leila,
>>>> 
>>>> Thanks for your reply. I'm using likelihood cross validation (CVh) 
>>>> instead of least squares cross validation (LSCV). To my knowledge 
>>>> adehabitatHR doesn't run CVh so I think I will need to write a script or 
>>>> find another program which can calculate CVh.
>>>> 
>>>> 
>>>> I used GME from spatialecology.com to calculate all of my KDEs, but I 
>>>> don't think GME runs bootstrapping. So I'm trying to run it through the 
>>>> bootstrapping package in R. Here is what I need to do:
>>>> 
>>>> 
>>>> 1) Create my spatial points layer
>>>> =mydata
>>>> 
>>>> 2) Run the bootstrapping script:
>>>> 
>>>>> mydataboot<-boot(mydata, (KDE script), R=1000)
>>>> 
>>>> mydata= my spatial points file
>>>> (KDE script)= what formula to run the bootstrap on
>>>> 
>>>> R=1000 = the number of iterations I would like to run
>>>> 
>>>> The issue I'm having is getting the bootstrapping script to run the KDE 
>>>> formula=(KDE script)
>>>> 
>>>> Cheers,
>>>> Christian
>>>> 
>>>> On Friday, May 3, 2013 12:58:55 AM UTC-7, Leila Brook wrote:
>>>> Hi Christian,
>>>> 
>>>> There is a package called adehabitatHR that does kernel density 
>>>> estimation with LSCV bandwidth using a function called kernelUD(). I 
>>>> don't know if this will is similar to what you want? You will need to 
>>>> have the data in UTM and in SpatialPoints or SpatialPointsDataFrame 
>>>> format, which you can do using the sp package.
>>>> 
>>>> 
>>>> 
>>>> There is a great vignette for adehabitatHR that will help with data 
>>>> formatting and running the KDE. I  exported the utilisation distribution 
>>>> for ArcGIS using writeRaster() in the raster package, but you can also 
>>>> export as contours.
>>>> 
>>>> 
>>>> 
>>>> Cheers
>>>> 
>>>> Leila
>>>> 
>>>> 
>>>> 
>>>> ________________________________________
>>>> 
>>>> From: tropi... at googlegroups.com [tropi... at googlegroups.com] on behalf of 
>>>> Stewart Macdonald [stewartm... at gmail.com]
>>>> 
>>>> Sent: Friday, 3 May 2013 5:12 PM
>>>> 
>>>> To: tropi... at googlegroups.com
>>>> 
>>>> Subject: Re: [Trop-R] Kernel Density Estimation Bootstrap
>>>> 
>>>> 
>>>> 
>>>> Hi Christian,
>>>> 
>>>> 
>>>> 
>>>> What sort of data are you trying to import? A series of lat/longs that 
>>>> are in a text (or Excel) file? If so, the rGDAL library can be used to 
>>>> create spatial points:
>>>> 
>>>> 
>>>> 
>>>> ###############
>>>> 
>>>> library(rgdal)
>>>> 
>>>> # if you don't already have this installed, you can do: 
>>>> install.packages('rgdal')
>>>> 
>>>> 
>>>> 
>>>> # read in the data from a tab-delimited text file that has a header line
>>>> 
>>>> rawPoints <- read.table('/data.csv', header=TRUE, sep='\t')
>>>> 
>>>> 
>>>> 
>>>> # format as a SpatialPoint class and define the projection/datum (WGS84)
>>>> 
>>>> q <- cbind(rawPoints$recordLon, rawPoints$recordLat)
>>>> 
>>>> colnames(q) <- c("x", "y")
>>>> 
>>>> qqq <- SpatialPoints(q, proj4string=CRS("+init=epsg:4326"))
>>>> 
>>>> ###############
>>>> 
>>>> 
>>>> 
>>>> Note: the above is a copied-and-pasted fragment from a working script of 
>>>> mine. I assume this fragment will work for you.
>>>> 
>>>> 
>>>> 
>>>> I can't help you with your second question, and the answer to your third 
>>>> question will depend on the format of the bootstrap data (e.g., more 
>>>> points, polygons, raster).
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> Hope this helps,
>>>> 
>>>> 
>>>> 
>>>> Stewart
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On 03/05/2013, at 4:57 PM, Christian Gredzens 
>>>> <christian... at my.jcu.edu.au> wrote:
>>>> 
>>>> 
>>>> 
>>>>> Hi All,
>>>> 
>>>>> 
>>>> 
>>>>> I would like to run a bootstrap within R on a spatial dataset using a 
>>>>> kernel density estimation function. Can anyone help me with this? I am 
>>>>> new to R and thus have very little experience. It seems relatively easy 
>>>>> to run the actual bootstrap but I am having problems in the following 
>>>>> areas:
>>>> 
>>>>> 
>>>> 
>>>>> 1) I don't know how to import spatial data into R and assign it an 
>>>>> appropriate projection
>>>> 
>>>>> 2) I need to setup a kernel function using a specific bandwidth 
>>>>> (likelihood cross-validation (CVh)) and create the script so it will run 
>>>>> within the bootstrap script
>>>> 
>>>>> 3) I need to export the bootstrap data as an ArcGIS 10 file.
>>>> 
>>>>> 
>>>> 
>>>>> Best Regards,
>>>> 
>>>>> Christian Gredzens
>>>> 
>>>>> 
>>>> 
>>>>> Christian Gredzens
>>>> 
>>>>> MSc Student
>>>> 
>>>>> School of Marine and Tropical Biology
>>>> 
>>>>> James Cook University
>>>> 
>>>>> Townsville, QLD, Australia 4811
>>>> 
>>>>> Christian... at my.jcu.edu.au
>>>> 
>>>>> Phone: 04 3875 3652
>>>> 
>>>>> 
>>>> 
>>>>> --
>>>> 
>>>>> An R group for questions, tips and tricks relevant to spatial ecology 
>>>>> and climate change.
>>>> 
>>>>> All R questions welcome.
>>>> 
>>>>> ---
>>>> 
>>>>> You received this message because you are subscribed to the Google 
>>>>> Groups "Tropical R" group.
>>>> 
>>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>>> an email to tropical-r+... at googlegroups.com.
>>>> 
>>>>> To post to this group, send an email to tropi... at googlegroups.com.
>>>> 
>>>>> To view this discussion on the web, visit 
>>>>> https://groups.google.com/d/msg/tropical-r/-/DgXQECjrCaoJ.
>>>> 
>>>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>> 
>>>>> 
>>>> 
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> 
>>>> An R group for questions, tips and tricks relevant to spatial ecology and 
>>>> climate change.
>>>> 
>>>> All R questions welcome.
>>>> 
>>>> ---
>>>> 
>>>> You received this message because you are subscribed to the Google Groups 
>>>> "Tropical R" group.
>>>> 
>>>> To unsubscribe from this group and stop receiving emails from it, send an 
>>>> email to tropical-r+... at googlegroups.com.
>>>> 
>>>> To post to this group, send an email to tropi... at googlegroups.com.
>>>> 
>>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>>> --
>>> Roger Bivand
>>> Department of Economics, Norwegian School of Economics,
>>> Helleveien 30, N-5045 Bergen, Norway.
>>> voice: +47 55 95 93 55; fax +47 55 95 95 43
>>> e-mail: Roger.Bivand at nhh.no
>>> 
>> 
>
>

-- 
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no



More information about the R-sig-Geo mailing list