[R-sig-Geo] Count occurrences less memory expensive than superimpose function in several spatial objects

Vijay Lulla
Thu Aug 20 02:49:52 CEST 2020


Hi Alexandre,
As far as I can tell (mostly from reading the docs; I have no prior
experience using multiplicity or superimpose myself), they are just
counting how many points share each x,y coordinate pair. So you can do
this with the group-by semantics of either tidyverse or SQL to generate
the res.xy data.frame. Below is an example that generates res.xy with
data.table instead (I'm not as familiar with tidyverse):

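library(data.table)

# Stack the coordinates of all ppp objects into one table
target_sub1 <- rbindlist(lapply(target, as.data.table))
# Count how many points share each (x, y) pair
res1 <- target_sub1[, .(res=.N), by=.(x,y)]
# Attach the counts back to every original point
res.xy1 = res1[target_sub1, on=c("x","y")]

all.equal(res.xy, res.xy1, check.attributes=FALSE) # should return TRUE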
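To make the group-and-join idea concrete, here is a minimal sketch on toy data. The coordinate values, the threshold of 2, and the names pts, counts, per_point and frequent are made up for illustration and do not come from the ants example. It also shows that, if only the coordinates above the threshold are needed, the grouped table can be filtered directly without joining back to every point.

```r
library(data.table)

# Toy stand-in for the stacked coordinates (hypothetical values)
pts <- data.table(x = c(1, 1, 2, 1, 2, 3),
                  y = c(5, 5, 6, 5, 6, 7))

# One row per distinct (x, y) pair with its number of occurrences
counts <- pts[, .(res = .N), by = .(x, y)]

# Join back: one row per original point, carrying the count of its pair
per_point <- counts[pts, on = c("x", "y")]

# Keep only the coordinates that occur more than `threshold` times
threshold <- 2
frequent <- counts[res > threshold]
print(frequent)
```

Because counts already has one row per distinct coordinate, filtering it gives unique points directly, which may also make a later duplicate-removal step unnecessary.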

If you're using SQL, you just join the raw table with the grouped table
to get a table of coordinates and their occurrence counts. And,
considering the number of coordinates you have, I recommend either
data.table or SQL to generate the final output.
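A minimal sketch of the SQL variant, run from R through the DBI and RSQLite packages on an in-memory SQLite database. The choice of SQLite, the table name pts, the toy values and the threshold of 2 are all assumptions for illustration; any SQL database with GROUP BY and JOIN should behave the same way.

```r
library(DBI)

# In-memory SQLite database (assumption: the RSQLite package is installed)
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Toy stand-in for the raw coordinates (hypothetical values)
pts <- data.frame(x = c(1, 1, 2, 1, 2, 3),
                  y = c(5, 5, 6, 5, 6, 7))
dbWriteTable(con, "pts", pts)

# Join the raw table with the grouped table: one row per original point
res_xy <- dbGetQuery(con, "
  SELECT p.x, p.y, g.res
  FROM pts AS p
  JOIN (SELECT x, y, COUNT(*) AS res FROM pts GROUP BY x, y) AS g
    ON p.x = g.x AND p.y = g.y
")

# Or keep only the distinct coordinates above the threshold
frequent <- dbGetQuery(con, "
  SELECT x, y, COUNT(*) AS res
  FROM pts
  GROUP BY x, y
  HAVING COUNT(*) > 2
")

dbDisconnect(con)
```

With an on-disk database instead of ":memory:", the coordinates would not all need to be held in RAM at once, at the cost of slower queries.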
HTH,
Vijay.


On Wed, Aug 19, 2020 at 4:22 PM ASANTOS via R-sig-Geo <
r-sig-geo using r-project.org> wrote:

> Dear r-sig-geo Members,
>
> I would like to read several shapefiles, count the occurrences at each
> coordinate, and create a final shapefile containing the coordinates
> that exceed a threshold number of occurrences. I tried converting the
> shapefiles to ppp objects (because part of my data set is in shapefiles
> and another part in ppp objects) and applying the superimpose function,
> without success. In my synthetic example:
>
> #Packages
> library(spatstat)
> library(dplyr)
> library(sp)
> library(rgdal)
> library(raster)
>
>
> #Point process example
> data(ants)
> ants.df<-as.data.frame(ants) #Convert to data frame
>
> # Sample 75% of the original dataset, repeat this 9 times and create a
> # shapefile in each loop
>
> for(i in 1:9){
> s.ants.df<-sample_frac(ants.df, 0.75)
> # Create new ppp object
> s.ants<-ppp(x=s.ants.df[,1],y=s.ants.df[,2],window=ants$window)
> sample.pts<-cbind(s.ants$x,s.ants$y)
> pts.sampling = SpatialPoints(sample.pts)
> UTMcoor.df <- SpatialPointsDataFrame(pts.sampling,
> data.frame(id=1:length(pts.sampling)))
> writeOGR(UTMcoor.df, ".",paste0('sample.shape',i),
> driver="ESRI Shapefile",overwrite=TRUE)
> }
>
> #Read all the 9 shapefiles created
> all_shape <- list.files(pattern="\\.shp$", full.names=TRUE)
> all_shape_list <- lapply(all_shape, shapefile)
>
> #Convert shapefiles to spatstat ppp objects
> target <- vector("list", length(all_shape_list))
> for(i in 1:length(all_shape_list)){
> target[[i]] <- ppp(x=all_shape_list[[i]]@coords[,1],
> y=all_shape_list[[i]]@coords[,2],window=ants$window)}
>
> #Join all ppp objects with superimpose, then count duplicates with multiplicity
> target_sub<-do.call(superimpose,target)
> res<-multiplicity(target_sub)
>
> #Occurrences in the same coordinate > 5
> res.xy<-data.frame(x=target_sub$x,y=target_sub$y,res=res)
> res_F<-res.xy[res.xy$res>5,]
>
> #Final shapefile
> final.pts<-cbind(res_F[,1],res_F[,2])
> pts.final = SpatialPoints(final.pts)
> UTMcoor.df <- SpatialPointsDataFrame(pts.final,
> data.frame(id=1:length(pts.final)))
> UTMcoor.df2 <-remove.duplicates(UTMcoor.df)
> writeOGR(UTMcoor.df2, ".", paste0('final.ants'),
> driver="ESRI Shapefile",overwrite=TRUE)
>
>
> This approach works very well in this synthetic example! But in my
> real data set I have 99 shapefiles with 10^7 coordinates, and when I
> try to use do.call(superimpose,target) my 32 GB of RAM is exhausted
> and the session crashes.
>
> Does anyone have ideas for how I can create a new shapefile using the
> occurrence criterion above, in a way that is less memory expensive than
> superimposing all the objects created?
>
> Thanks in advance,
> Alexandre
>
> --
> Alexandre dos Santos
> Geotechnologies and Spatial Statistics applied to Forest Entomology
> Instituto Federal de Mato Grosso (IFMT) - Campus Caceres
> Caixa Postal 244 (PO Box)
> Avenida dos Ramires, s/n - Vila Real
> Caceres - MT - CEP 78201-380 (ZIP code)
> Phone: (+55) 65 99686-6970 / (+55) 65 3221-2674
> Lattes CV: http://lattes.cnpq.br/1360403201088680
> OrcID: orcid.org/0000-0001-8232-6722
> ResearchGate: www.researchgate.net/profile/Alexandre_Santos10
> Publons: https://publons.com/researcher/3085587/alexandre-dos-santos/
> --
>


-- 
Vijay Lulla, PhD
ORCID <https://orcid.org/0000-0002-0823-2522> | Homepage
<http://vlulla.github.io> | Google Scholar
<https://scholar.google.com/citations?user=VjhJWOgAAAAJ&hl=en> | Github
<https://github.com/vlulla>
