[R-sig-Geo] use quadtree idea to create densified grids containing maximum number points

Zhou, Yuhong yuzhou at mcw.edu
Fri Sep 26 02:51:51 CEST 2014

The goal is to divide a bounding square into nonuniform grids, constraining
that each grid does not contain maximum number points.

The origins (lower-left corner) and the side length of the square are given
to define the bounding square. A point threshold, for example, a value of 50
is chosen. The points shapefile is available, from which the projection can
be determined. The points shapefile has a variable called Obs, describing
how many observations are attached to that point.

The quadtree idea is employed to do this. If the number of original data
points/observations in the bounding square is larger than the size of the
threshold, we split the study space into four separate, equal-area
rectangles. Then the number of points/observations within each rectangle is
counted. If the value exceeds the threshold, that rectangle is further
subdivided. This is repeated for all rectangles, at all levels of
subdivision, until no rectangle has a number of original data points in
excess of the threshold.

I have written a function to try to achieve this. The current code run
recursively, but never ends the recursion.  Either something is wrong with
the return(), or with the structure of the function. I also want to record
the XY coordinates of grids that are leaves, and trace their
grid-parents/grandparents/...., but not know how to get it coded correctly.

Here is the code:
## origin, delta -> bounding square
## points (shapefile)
## threshold
##  xy_df to record the XY of child/leaf grids
## id - indexing the classes of leaf/parent

quadtree <- function(origin, delta, threshold=50, points, strCRS, id=1, xy_df) {
  d = 4
  ## Get the cellcenter and cell size
  cs <- c(delta/2, delta/2) ## cell size
  cd <- c(2,2) ## same for quadtree division
  cc <- origin + (cs/2) ##cellcenter

  ## create the grid structure and spatial grid data frame
  grd <- GridTopology(cellcentre.offset=cc, cellsize=cs, cells.dim=cd)
  sp_grd_all <- SpatialGridDataFrame(grd,data=data.frame(id=1:prod(cd)),proj4string=strCRS)

  ## overlay with the points to get the sum of the attribute values (e.g. #observations) of the points
  newgrd <-aggregate(pointshp[names(pointshp) == "Obs"], sp_grd_all, FUN =sum)
  sp_grd_all[["Obs"]] <- newgrd at data

  ## Loop each grid, if the number of observations within the grid does not exceed the threshold value (maximum number of observations a grid can
accommadate), add the XY coordinates of the grid to a dataframe; otherwise, split/divide the grid into four quadrants (recursively call the function quadtree);
  for (j in 1:4) {
    if ((sp_grd_all$Obs[j,1] <=k)) {
      x = coordinates(sp_grd_all)[j,1]
      y = coordinates(sp_grd_all)[j,2]
      newdf <-data.frame(xcoor=x, ycoor=y)
      xy_df <- rbind(xy_df, newdf)

      #rv = list(id=id+j, xcoor=x, ycoor=y)  ## i also want to create a class for the child grid; but not sure whether the coding is correct
      #class(rv) <- "quadtree.leaf"

      co1 <- coordinates(sp_grd_all)[j,] - (cs/2) ## calculate the origins (the lower-left corner) of each sub-grid
      len1 <- len/2  ## calculate the total length of the sub-grid
      quadtree(co1, len1, k, points, boundary, strCRS, id+j,xy_df)  ## recursive call function to divide the grid to quadrants

      #rv <- list(index=j, qt=quadtree(co1, len1, k, points, boundary, strCRS, id+j))
      #class(rv) <- "quadtree"  ## i also want to create a class for the parent grid;
    ## return(xy_df)   ## when I included the return() here, the code stops when a leaf/child is found; when i removed it, the recursive call never ends.

Any feedbacks/suggestions/references to similar functions will be
appreciated. Thank you!


More information about the R-sig-Geo mailing list