[R] mutlidimensional in.convex.hull (was multidimensional point.in.polygon??)

Charles C. Berry cberry at tajo.ucsd.edu
Thu Dec 10 23:51:14 CET 2009


On Thu, 10 Dec 2009, Keith Jewell wrote:

> Hi All (especially Duncan and Baptiste),
>
> Summary (of lengthy bits below):
> I will have a convex hull in multiple (3 to 10) dimensions derived from many
> (thousands) of points by geometry::convhulln.
> I will need to categorise other 'test' points as inside/outside that convex
> hull . e.g. given:
> ----------------------
> require(geometry)
> ps <- matrix(rnorm(4000),ncol=4)     # 'calibration' set
> phull <-  convhulln(ps)                      # convex hull
> xs <- matrix(rnorm(1200),ncol=4)    # 'test' set
> -----------------
> How do I categorise each point (row) in xs as inside/outside(/on) phull???
> There is tripack::in.convex.hull but that doesn't handle my dimensionality.
>
> Thanks to Duncan Murdoch for the suggestion (just a few lines down,
> previously made by Baptiste Auguie): of testing a single point thus:
>  i) add the test point to the set of points defining the convex hull,
>  ii) recalculate the convex hull
>  iii) if the test point is part of the new convex hull, then it was outside
> the original
>
> BUT I have many (thousands) of test points, so this would involve very many
> convex hull calculations. My suggestion, immediately below, requires finding
> the signs of perpendicular distances from each test point to each
> multidimensional 'plane' defining the convex hull (NB: phull  is a matrix in
> which each row defines such a 'plane').

Many?


> set.seed(1234)
> ps <- matrix(rnorm(4000),ncol=4)
> phull <-  convhulln(ps)
> xs <- matrix(rnorm(1200),ncol=4)
> phull2 <- convhulln(rbind(ps,xs))
> nrp <- nrow(ps)
> nrx <- nrow(xs)
> outside <- unique(phull2[phull2>nrp])-nrp
> done <- FALSE
> while(!done){
+     phull3 <- convhulln(rbind(ps,xs[-(outside),]))
+     also.outside <- (1:nrx)[-outside][unique(phull3[phull3>nrp])-nrp]
+     print(length(also.outside))
+     outside <- c(outside,also.outside)
+     done <- length(also.outside)==0
+ }
[1] 3
[1] 0
>

phull2 was evaluated once, phull3 twice.

Any point that is in the convex hull of rbind(ps,xs) is either in or 
outside the convex hull of ps. Right? So, just recursively eliminate 
points that are in the convex hull of the larger set.

Chuck

p.s. for

  	xs <- matrix(rnorm(120000),ncol=4)

it required about a dozen iterations

>
> Baptiste has found a Matlab implementation
> <http://www.mathworks.com/matlabcentral/fileexchange/10226-inhull> of (what
> looks like) my algorithm. I don't speak Matlab, but this looks non-trivial
> to code in R. I'll do it if I have to, but if it already exists it would be
> nice. If I do have to code it, I'd really appreciate an expression in
> algebra rather than Matlab!
>
> Any pointers will be much appreciated,
>
> Keith Jewell
> "Duncan Murdoch" <murdoch at stats.uwo.ca> wrote in message
> news:4B20E1EA.3030507 at stats.uwo.ca...
>> On 10/12/2009 5:15 AM, Keith Jewell wrote:
>>> Hi,
>>>
>>> Doing some more reading, I think the problem is easier because the hull
>>> is convex. Then an algorithm for testing points might be:
>>>
>>> a) Define the convex hull as a set of planes (simplexes).
>>>     [as returned by convhulln!!]
>>>
>>> b) Define one point, i, known to be interior
>>>     [e.g. mean of all the points defining the hull]
>>>
>>> c) If point x is
>>>     i) for every plane, on the same side as i; x is interior
>>>    ii) for every plane, on the same side as i or in the plane; x is in
>>> the surface
>>>  iii) else x is exterior
>>
>> That looks like it would work, but wouldn't it be easier to do the
>> following:
>>
>> Compute the convex hull with the new point added. If the point is
>> exterior, the new point will be part of the hull.  If interior, it won't
>> be.  If it is on the boundary, it's probably unpredictable, but due to
>> rounding error, that's probably true even with a perfect algorithm.
>>
>> I didn't notice that you said how your original polygon is defined, but if
>> it is defined as a convex hull or in terms of its vertices, the above
>> method would work.  If it's defined some other way, it might be hard.
>>
>> Duncan Murdoch
>>
>>
>>>
>>> So now I need to find the directions of points from multidimensional
>>> planes.Perhaps I can find the vectors of the perpendiculars from the
>>> points to the planes (possibly extended) and test for
>>> parallel/anti-parallel?
>>>
>>> I feel that I'm in the right direction because this uses the structure of
>>> a convex hull returned by convhulln. But, I still feel I'm re-inventing
>>> the wheel. Surely this has been done before? Isn't a (the?) major purpose
>>> of a convex hull to test other points for inclusion?
>>>
>>> Perhaps when I get the geometry sorted this will be so easy I'll
>>> understand why noone has pointed me to an existing R function, but
>>> currently I feel I and Baptiste are wandering in the dark :-(
>>>
>>> Any hints?
>>>
>>> Thanks in advance,
>>>
>>> Keith Jewell
>>> -----------------------------------------------------------------
>>> "baptiste auguie" <baptiste.auguie at googlemail.com> wrote in message
>>> news:de4e29f50912040550m71fbffafnfa1ed6e0f4451604 at mail.gmail.com...
>>> Hi,
>>>
>>> Yet another one of my very naive ideas on the subject: maybe you can
>>> first evaluate the circumscribed and inscribed spheres of the base set
>>> of points (maximum and minimum of their distances to the center of
>>> gravity). Any points within a distance smaller than the infimum is
>>> good, any point further than the supremum is not good. This should be
>>> faster than the calculation of a convex hull for each point. Of course
>>> the usefulness of this first test really depends on how aspherical is
>>> your base convex hull.
>>>
>>> I do hope to read a real answer from someone who knows this stuff!
>>>
>>> HTH,
>>>
>>> baptiste
>>>
>>>
>>> 2009/12/4 Keith Jewell <k.jewell at campden.co.uk>:
>>>> Hi,
>>>>
>>>> I seek to identify those points in/outside a multidimensional convex
>>>> hull
>>>> (geometry::convhulln). Any suggestions?
>>>>
>>>> Background just in case I'm going down a really wrong road:
>>>>
>>>> Given an observed data set with one dependent/observed variable (Y) and
>>>> multiple (3 to 10) independent/design variables (X1, X2, ...) I want to
>>>> increase the number of points by interpolating. I'm using expand.grid(X)
>>>> to
>>>> generate the X points and locfit::predict.locfit to interpolate the Y
>>>> values. No problem so far.
>>>>
>>>> BUT my observed X data set is very far from rectangular, so the set of
>>>> points produced by expand.grid includes many "extrapolations" which I'd
>>>> want to remove. I'm aware of the problems of defining extrapolation in
>>>> multiple dimensions and am minded to define it as "outside the convex
>>>> hull",
>>>> hence my question.
>>>>
>>>> In searching r-project.org I came across a thread in which baptiste
>>>> auguie
>>>> said "one general way to do this would be to compute the convex hull
>>>> (?chull) of the augmented set of points and test if the point belongs to
>>>> it"; an approach I'd considered generalising to multiple points thus
>>>> (pseudo
>>>> R code)...
>>>> ----------------
>>>> baseHull <- convhulln(baseSet)
>>>> augHull <- convhulln(augSet)
>>>> while (augHull != baseHull)
>>>> { augSet <- augSet [-(augHull !%in% baseHull)]
>>>> augHull <- convhulln(augSet)
>>>> }
>>>> --------------------
>>>> ... but this has that horrible loop including a convhulln!
>>>>
>>>> In the (real, typical) test data set I'm using for development baseSet
>>>> is 5
>>>> columns by 2637 rows while baseHull has only 42 distinct points. If
>>>> augHull
>>>> has a similarly simple hull, then each time round the loop will remove
>>>> only
>>>> a few dozen points; I need to remove many thousands.
>>>>
>>>> And (in my naivete) I think there must be a better way of testing for a
>>>> point inside a polygon, I'd have thought convhulln must need to do that
>>>> often!
>>>>
>>>> Thanks in advance for any pointers,
>>>>
>>>> Keith Jewell
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

Charles C. Berry                            (858) 534-2098
                                             Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu	            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901




More information about the R-help mailing list