[R-sig-Geo] Count occurrences less memory expensive than superimpose function in several spatial objects

ASANTOS @|ex@ndre@@nto@br @end|ng |rom y@hoo@com@br
Wed Aug 19 22:21:31 CEST 2020


Dear r-sig-geo Members,

I would like to read several shapefiles, count the occurrences at each 
coordinate, and create a final shapefile containing only the coordinates 
whose count exceeds a threshold. I tried converting the shapefiles to ppp 
objects (because part of my data set is in shapefiles and part is already 
in ppp objects) and applying the superimpose function, without success on 
my real data. Here is a synthetic example:

#Packages
library(spatstat)
library(dplyr)
library(sp)
library(rgdal)
library(raster)


#Point process example
data(ants)
ants.df<-as.data.frame(ants) #Convert to data frame

# Sample 75% of the original dataset, repeat this 9 times and write a
# shapefile in each iteration
for(i in 1:9){
  s.ants.df <- sample_frac(ants.df, 0.75)
  # Create new ppp object
  s.ants <- ppp(x=s.ants.df[,1], y=s.ants.df[,2], window=ants$window)
  sample.pts <- cbind(s.ants$x, s.ants$y)
  pts.sampling <- SpatialPoints(sample.pts)
  UTMcoor.df <- SpatialPointsDataFrame(pts.sampling,
                                       data.frame(id=1:length(pts.sampling)))
  writeOGR(UTMcoor.df, ".", paste0('sample.shape',i),
           driver="ESRI Shapefile", overwrite_layer=TRUE)
}

#Read all the 9 shapefiles created
all_shape <- list.files(pattern="\\.shp$", full.names=TRUE)
all_shape_list <- lapply(all_shape, shapefile)

#Convert shapefiles to spatstat ppp objects
target <- vector("list", length(all_shape_list))
for(i in 1:length(all_shape_list)){
target[[i]] <- ppp(x=all_shape_list[[i]]@coords[,1],
y=all_shape_list[[i]]@coords[,2],window=ants$window)}

#Join all ppp objects using multiplicity
target_sub<-do.call(superimpose,target)
res<-multiplicity(target_sub)

#Occurrences in the same coordinate > 5
res.xy<-data.frame(x=target_sub$x,y=target_sub$y,res=res)
res_F<-res.xy[res.xy$res>5,]

#Final shapefile
final.pts<-cbind(res_F[,1],res_F[,2])
pts.final = SpatialPoints(final.pts)
UTMcoor.df <- SpatialPointsDataFrame(pts.final, 
data.frame(id=1:length(pts.final)))
UTMcoor.df2 <-remove.duplicates(UTMcoor.df)
writeOGR(UTMcoor.df2, ".", 'final.ants',
         driver="ESRI Shapefile", overwrite_layer=TRUE)


This approach works very well in this synthetic example. But in my 
real data set I have 99 shapefiles with 10^7 coordinates, and when I 
call do.call(superimpose,target) it exhausts my 32 GB of RAM and the 
session crashes.

Does anyone have ideas on how I can create a new shapefile using the 
occurrence criterion shown above, in a way that uses less memory than 
superimposing all the objects created?
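
To make the criterion concrete without superimpose, here is a minimal 
base R sketch that keeps only a running table of counts per coordinate 
and processes one set of coordinates at a time. It is an untested idea 
for the real data, not a solution: the toy coords_list and the threshold 
value are made up for illustration, coordinates are matched by exact 
equality, and it only saves memory if the number of distinct coordinates 
is much smaller than the total number of points.

```r
# Toy stand-in for the coordinates read from each shapefile (hypothetical data)
coords_list <- list(
  cbind(x = c(1, 2, 3), y = c(1, 2, 3)),
  cbind(x = c(1, 2, 4), y = c(1, 2, 4)),
  cbind(x = c(1, 5, 3), y = c(1, 5, 3))
)

# Running table of counts, one row per distinct coordinate pair
counts <- data.frame(x = numeric(0), y = numeric(0), n = integer(0))

for (xy in coords_list) {
  # Count occurrences within this one set of coordinates
  part <- aggregate(n ~ x + y, data = data.frame(xy, n = 1L), FUN = sum)
  # Add them to the running totals and collapse to one row per coordinate
  counts <- aggregate(n ~ x + y, data = rbind(counts, part), FUN = sum)
}

# Keep the coordinates that occur at least `threshold` times (assumed value)
threshold <- 2
keep <- counts[counts$n >= threshold, ]
```

In the real case each element of coords_list would be replaced by reading 
one shapefile inside the loop, so that only one file plus the running 
table is held in memory at any time.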

Thanks in advance,
Alexandre

-- 
Alexandre dos Santos
Geotechnologies and Spatial Statistics applied to Forest Entomology
Instituto Federal de Mato Grosso (IFMT) - Campus Caceres
Caixa Postal 244 (PO Box)
Avenida dos Ramires, s/n - Vila Real
Caceres - MT - CEP 78201-380 (ZIP code)
Phone: (+55) 65 99686-6970 / (+55) 65 3221-2674
Lattes CV: http://lattes.cnpq.br/1360403201088680
OrcID: orcid.org/0000-0001-8232-6722
ResearchGate: www.researchgate.net/profile/Alexandre_Santos10
Publons: https://publons.com/researcher/3085587/alexandre-dos-santos/
--

