[R-sig-Geo] Summary to Query for overlapping points in R

Katrina Bennett kbennett at uvic.ca
Mon Mar 26 22:28:21 CEST 2007

This is a summary of the responses and my solution to my initial posting of
Fri, 23 Mar 2007: Query for overlapping points in R?

Thanks very much for the responses and assistance with this problem. Roger,
the 'clip' function is included in the entire version of the code, appended

Katrina Bennett

Response 1: Roger Bivand (Roger.Bivand at nhh.no)

Well, why not simply measure the distances between the points in the two
files, treating points within a small threshold as common? Use a small
threshold rather than equality to avoid numerical fuzz problems.
spDistsN1() in sp will do that iterated over points in one of the sets.
Using the two indices, you can build up your output. Please consider using
sp classes - messing around inside the deprecated Map class is not
necessarily robust.

By the way, what is the function clip() in your code below?



Response 2: Nicholas Lewin-Koh (nikko at hailmail.net)
Hi Katrina,
If your data are really only points all you need to do if you have dat1 and
dat2 as two datasets like you described is, i1<-which(dat1$lat%in%dat2$lat &
dat1$long%in%dat2$long) i2<-which(dat2$lat%in%dat1$lat &

Mind you this is not elegant, and there is probably a faster ways,
but it will get you what you need.


My Updated Code:

#Import the libraries which we will need
require (sp)
require (shapefiles)

#This is a list of variabiles to process  #removed "prec", from sentence
below to run 1 more time...
vars = c("monavg_prec","monavg_temp", "monavg_tmin", "monavg_tmax")

#This is a list of climate drivers to process
datats = c("elnino", "lanina", "ensoneutral", "pdominus", "pdoplus")

#This is the list of normals for these drivers
normalbits = c("normals_enso", "normals_enso", "normals_enso",
"normals_pdo", "normals_pdo")

#This will define the regions
region = 8

# Uses internal data in x to return only the entries in x that are within
the poly
clip <- function(x) {
  moo <- NULL
  for (point in 1:nrow(x)) {
    if (x[point,]$io > 0) {
      moo <- rbind(moo, x[point,])

shape <- read.shapefile("BCregions7g1") #open the shapefile for each region

#This loop runs over all the climate driver and variable combinations 
for (varname in vars){
  for (id in 1:5){
    variancefile <- paste(varname, "_", datats[id], ".csv", sep = "")
    normalfile <- paste(varname, "_", normalbits[id], ".csv", sep = "")
    #Import the data file(s)
    variances <- read.csv(variancefile, header = T)
    normals <- read.csv(normalfile, header = T)
    variance <- colnames(variances)
      for (i in 3:length(variance)){
          colnames(variances)[i] <- substr(variance[i],2,3)
     normal <- colnames(normals)
      for (i in 3:length(normal)){
          colnames(normals)[i] <- substr(normal[i],2,3)
    #region_variables = c()
    #Clip regions (have to examine maptools package)
    clip_variance <- clip(cbind(variances, io =
point.in.polygon(variances$Long, variances$Lat,
shape$shp$shp[[region]]$points$X, shape$shp$shp[[region]]$points$Y)))
    clip_normal <- clip(cbind(normals, io = point.in.polygon(normals$Long,
normals$Lat, shape$shp$shp[[region]]$points$X,
    i1 <- which(clip_variance$Lat%in%clip_normal$Lat &
    i2 <- which(clip_normal$Lat%in%clip_variance$Lat &
    o1 <- order(clip_variance$Lat[i1], clip_variance$Long[i1])
    o2 <- order(clip_normal$Lat[i2], clip_normal$Long[i2])
    # generate anomalies, retaining the order of the rows
    anomalies <- (clip_variance[i1,c(3:ncol(clip_variance))])[o1,] -
    geo_anom <- cbind((clip_variance[i1,c("Lat", "Long")])[o1,], anomalies)
    # Generate output files for the complete data set
    outputfile <- paste(varname, "_", datats[id], "_", region, ".csv", sep =
    write.csv(geo_anom, outputfile)

More information about the R-sig-Geo mailing list