[R-sig-Geo] Finding overlapping polygons

Luciano La Sala lucianolasala at yahoo.com.ar
Wed Jul 26 15:28:24 CEST 2017


Dear everyone,

I am studying the potential spatal distribution between vertebrate 
invasive species and native, threatened ones in the souther cone of 
South America. The invasive species' distribution was grossly 
approximated using an ecoregion approach, which is not relevant here. 
Then, I downloaded shapefiles for all terrestrial mammals and birds of 
the world from BirdLife International 
(http://www.biodiversityinfo.org/spcdownload/r5h8a1/) and IUCN 
(http://www.iucnredlist.org/technical-documents/spatial-data) public 
domains.

Our goal: using a two-step approach, identify native species (polygons) 
whose distributionoverlap -totally or partially- with that of the invasive.

Our problem: Some species (with small distribution) which are known to 
occur within the invasive distribution range are being left out of the 
final selection (checkedin QGIS). I suspect the function "isin" in the 
fastshp package (2nd loop) fails to select some of the overlapping 
polygons in Step 2, or some other problem. If a reproducible example is 
absolutely required, I can upload to dropbox.

For the sake of simplicity, I will present present a case example 
including one invasive species (e.g. wild boar) and all terrestrial 
mammals.

# Analysis of Wild Boar distributional overlap with IUCN mammal species 
install.packages("rgdal") install.packages("sp") 
install.packages("ggplot2") install.packages("rgeos") 
install.packages("raster") install.packages("fastshp", repos = 
"http://rforge.net", type = "source")install.packages("cleangeo")

mypath <- ("/Users/ ... /Terrestrial mammals") files <- 
list.files(path=mypath, pattern = "\\.shp$", full.names=T) 
length(files)  # 5,465 mammal species # Load study area

study_area <- read.shp("C:/Users/.../Study_area.shp", format="table", 
close=FALSE) salon <- range(study_area$x, na.rm=T) salat <- 
range(study_area$y, na.rm=T) bounding <- c(salon, salat) box <- 
data.frame(salon, salat) colnames(box) <- c("Longitud","Latitude") box

# Step 1. Run all native IUCN species' distribution through a loop and 
select those which overlap with the invasive's bounding box distribution

keep <- rep(NA, length(files)) length(keep)  # 5,465 for(i in 
1:length(files)){   r <- read.shp(files[i], format="table")   rlon <- 
range(r$x)   rlat <- range(r$y) ## Is out of the bounding box? test.lat 
<- (min(rlat) > max(salat)) |     (max(rlat) < min(salat))   test.lon <- 
(min(rlon) > max(salon)) |     (max(rlon) < min(salon)) keep[i] <- 
!(test.lon | test.lat) } keep <- which(keep==1) # Index of shapefiles 
that passed the first filter length(keep) # 773 species passed 1st 
filter # Step 2. Load invasive species distribution area

study_area <- readOGR("/Users/.../Study area","Study_area") study_area 
<- spTransform(study_area, CRS("+init=epsg:32721")) # To UTM # Get a 
report of geometry validity & issues for a sp spatial object report <- 
clgeo_CollectionReport(study_area) summary <- 
clgeo_SummaryReport(report) summary # Remove holes and simplify geometry 
of study area for computational efficiency outer <- 
Filter(function(x){x at ringDir==1}, study_area at polygons[[1]]@Polygons) # 
PREGUNTAR study_area <- SpatialPolygons(list(Polygons(outer, ID=1))) 
study_area <- disaggregate(study_area) areas <- gArea(study_area, 
byid=TRUE) #Remove polygons smaller than 3000 km2 study_area <- 
study_area[which(areas > 3E9),] study_area <- gSimplify(study_area, 
20000, topologyPreserve=TRUE) # Simplify geometry 
proj4string(study_area) <- CRS("+init=epsg:32721") study_area <- 
spTransform(study_area, CRS("+proj=longlat +datum=WGS84 +no_defs 
+ellps=WGS84 +towgs84=0,0,0")) xy <- fortify(study_area) # 
SpatialPolygons to data frame keep2 <- rep(NA,length(keep)) # 773 sp. 
that passed first filter for(i in 1:length(keep)){   r <- 
read.shp(files[keep[i]], format="polygon")   isin <- any(inside(r, 
x=xy$long, y=xy$lat), na.rm=T)   keep2[i] <- isin } keep2 <- 
which(keep2) # Index of shapefiles that passed the second sort 
length(keep2)  # 532 species that passed the second filter species <- 
list.files(path=mypath, pattern = "\\.shp$",full.names=F) species <- 
sub(".shp", "", species) species <- species[keep[keep2]]



---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus

	[[alternative HTML version deleted]]



More information about the R-sig-Geo mailing list