[R-sig-Geo] Calculate shortest distance between points belonging to different polygons

Tue Feb 17 13:31:51 CET 2015

Dear Thierry,

Thanks for your prompt reply!
I must admit I'm not familiar with ddply(), so I need some further advice 
from you.

 > randp.ppp<-as.ppp(randp.df)
 > randp.ppp
marked planar point pattern: 100 points
Mark variables: Point_ID, Polygon_ID
window: rectangle = [650602.2, 766311.2] x [7503238, 7607430] units

 > distance<-as.data.frame(nndist(randp.ppp, by = randp.ppp$Polygon_ID))
 > distance
     nndist(randp.ppp, by = randp.ppp$Polygon_ID)
1                                   2579.42199
2                                   1391.88915
3                                     59.85628
...
...
98                                   955.26483
99                                  3166.00894
100                                  705.25663

 > distance$Origin<-as.factor(randp.df$Polygon_ID)
 > distance$Origin
   [1] 13 6  12 5  5  12 12 10 8  3  5  13 3  3  10 3  5  3  13 3  2 3  
12 6  5  13 3  2  3  3  9  1  3  4  12 6  12 12 10 2  13 3  6  3 3  6  
3  9  1
  ...
  [99] 12 12
Levels: 1 2 3 4 5 6 7 8 9 10 11 12 13

 > ddply(distance, "Origin", function(x){
+   ignore.vars<-c(levels(x$Origin)[x$Origin[1]], "Origin")
+   x<-x[,!colnames(x) %in% ignore.vars]
+   data.frame(
+     Distance = apply(x, 1, min),
+     Target = colnames(x)[apply(x, 1, which.min)]
+   )
+ })
Error: dim(X) must have a positive length

Sorry but I can't spot the problem.
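
A possible reading of this error, offered as a sketch rather than a confirmed diagnosis: the data frame printed above has a single distance column, and its values match the plain nearest-neighbour distances, so `by = randp.ppp$Polygon_ID` was probably NULL (the marks of a ppp object are normally reached through marks(), not `$`). With only one distance column, the subsetting step inside the ddply() function drops to a bare vector and apply() fails. The toy values below are copied from the output above; the column name "nn" is made up for illustration.

```r
# Toy stand-in for the one-column result shown above ("nn" is a made-up name)
distance <- data.frame(nn = c(2579.42199, 1391.88915, 59.85628))
distance$Origin <- factor(c("13", "6", "12"))

# Same subsetting step as inside the ddply() function
ignore.vars <- c(levels(distance$Origin)[distance$Origin[1]], "Origin")
x <- distance[, !colnames(distance) %in% ignore.vars]

# Only one column is left, so the data frame has been dropped to a vector
is_vector <- is.null(dim(x))

# apply() needs something with dimensions, hence the error
msg <- tryCatch(apply(x, 1, min), error = function(e) conditionMessage(e))
print(is_vector)
print(msg)
```

If that reading is right, passing the grouping factor explicitly, for instance by = factor(marks(randp.ppp)$Polygon_ID), might give one distance column per polygon. This has not been run against the original data.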

Sincerely,

Ivan

On 17-Feb-15 13:21, Thierry Onkelinx wrote:
> Here is an example using ddply() to do the aggregation
>
> library(spatstat)
> library(plyr)
> n <- 100
> set.seed(123)
> point <- matrix(runif(2 * n), ncol = 2)
> colnames(point) <- c("X", "Y")
> point <- data.frame(point, Polygon = factor(LETTERS[kmeans(point, 
> 13)$cluster]))
> pattern <- as.ppp(point, W = owin(0:1, 0:1))
>
> distance <- as.data.frame(nndist(pattern, by = pattern$marks))
> distance$Origin <- point$Polygon
> ddply(distance, "Origin", function(x){
>   ignore.vars <- c(levels(x$Origin)[x$Origin[1]], "Origin")
>   x <- x[, !colnames(x) %in% ignore.vars]
>   data.frame(
>     Distance = apply(x, 1, min),
>     Target = colnames(x)[apply(x, 1, which.min)]
>   )
> })
>
> Best regards,
>
> Thierry
>
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek / Research Institute for Nature 
> and Forest
> team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
> Kliniekstraat 25
> 1070 Anderlecht
> Belgium
>
> To call in the statistician after the experiment is done may be no 
> more than asking him to perform a post-mortem examination: he may be 
> able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
> The plural of anecdote is not data. ~ Roger Brinner
> The combination of some data and an aching desire for an answer does 
> not ensure that a reasonable answer can be extracted from a given body 
> of data. ~ John Tukey
>
> 2015-02-17 11:35 GMT+01:00 Ivan Palmegiani <pan.sapiens.it at gmail.com>:
>
>     Dear members of the list,
>
>     I'm handling a SpatialPointsDataFrame with 100 random points
>     distributed within 13 different polygons.
>
>     > randp
>                coordinates Point_ID Polygon_ID
>     0  (690926.8, 7522595)     1_hs         13
>     1  (696727.1, 7576122)     2_hs          6
>     ...
>     ...
>     98 (728199.9, 7549810)    99_hs         12
>     99 (723428.1, 7545891)   100_hs         12
>
>     I need to calculate the shortest distance between points belonging
>     to different polygons. Basically I'd like to do what nndist
>     {spatstat} does. The difference is that the distance should be
>     calculated between groups of points instead of within a group of
>     points.
>
>     I tried to use "aggregate" as suggested below but it didn't work
>     out for me.
>     http://www.inside-r.org/packages/cran/spatstat/docs/nndist
>
>     Please find my try below:
>
>     > randp.df<-data.frame(randp)
>     > randp.df
>       Point_ID coords.x1 coords.x2 Polygon_ID
>     0     1_hs  690926.8   7522595         13
>     1     2_hs  696727.1   7576122          6
>     2     3_hs  723480.7   7546594         12
>
>     library(spatstat)
>
>     # Calculate nearest neighbors within a polygon
>     > nn.within.pol<-nndist(randp.df[,c(2,3)],by=marks(randp.df$Polygon_ID))
>     > nn.within.pol
>      [1]  2579.42199  1391.88915    59.85628   734.95108  734.95108
>     840.65125   957.47838   741.58160   955.26483 3307.59444
>     1361.64626  2682.70690
>      ...
>      ...
>      [97]  1349.88694   955.26483  3166.00894   705.25663
>     # Ok but these are not the distances I need
>
>     # Calculate nearest neighbors between polygons
>     nn.between.pol<-aggregate(nn.within.pol,
>     by=list(from=marks(randp.df$Polygon_ID)), min)
>     # Error in aggregate.data.frame(as.data.frame(x), ...) : arguments
>     must have same length
>
>     > nn.between.hs<-aggregate(randp.df[,c(2,3)],
>     by=list(randp.df$Polygon_ID), nndist)
>     > nn.between.hs
>
>     The outcome is an asymmetric data frame (dim 13, 6) with a lot of
>     empty cells and values that look unlikely to be distances.
>
>     The result I'd like to get is a matrix (dim 100, 1) with the
>     distances between each random point and its nearest neighbor
>     belonging to a different polygon (i.e. its nearest neighbor having
>     a different Polygon_ID).
>
>     Can someone kindly correct my script or suggest a function able to
>     do the job?
>
>     Cheers,
>
>     Ivan
>
>     _______________________________________________
>     R-sig-Geo mailing list
>     R-sig-Geo at r-project.org
>     https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>
>
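
For the result described in the original question (one distance per point, to the nearest point carrying a different Polygon_ID), a base-R sketch that avoids spatstat altogether is shown below. The coordinates and IDs are invented toy data standing in for randp.df, and a full pairwise distance matrix is assumed to be affordable, which it should be for 100 points.

```r
# Hypothetical toy data standing in for randp.df (values are made up)
pts <- data.frame(
  x = c(0, 1, 5, 6, 10),
  y = c(0, 0, 0, 0, 0),
  Polygon_ID = factor(c("A", "A", "B", "B", "C"))
)

# Full pairwise Euclidean distance matrix (n x n)
d <- as.matrix(dist(pts[, c("x", "y")]))

# Mask every pair that shares a polygon; this also masks the diagonal
id <- as.character(pts$Polygon_ID)
d[outer(id, id, "==")] <- Inf

# For each point: distance to, and polygon of, the nearest "foreign" point
nearest <- apply(d, 1, which.min)
result <- data.frame(
  Distance = apply(d, 1, min),
  Target = pts$Polygon_ID[nearest]
)
print(result)
```

The result has one row per point, in the original point order, which matches the (100, 1) shape asked for once the Distance column is taken on its own.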