# [R] How to do a probability density based filtering in 2D?

Emmanuel Levy emmanuel.levy at gmail.com
Sat Nov 20 04:32:26 CET 2010

```Hello Roger,

Thanks for the suggestions.

I finally managed to do it using the output of kde2d - The code is
pasted below. Actually this made me realize that the outcome of kde2d
can be quite influenced by outliers if a boundary box is not given
(try running the code without the boundary box, e.g.lims, and you will
see).

library(MASS)
x1 = rnorm(10000)
x2 = rnorm(10000)
nBins=200

z1 = kde2d(x1,x2,n=nBins, lims=c(-4,4,-4,4))

x1.cut = cut(x1, seq(z1\$x[1], z1\$x[nBins],len=(nBins+1) ))
x2.cut = cut(x2, seq(z1\$y[1], z1\$y[nBins],len=(nBins+1) ))
xy.cuts = data.frame(x1.cut,x2.cut, ord=1:(length(x1.cut)) )

density = data.frame( X=rep(factor(levels(x1.cut)),rep(nBins,nBins) ),
Y=rep(factor(levels(x2.cut)),nBins) , Z= as.vector(z1\$z))

xy.density = merge( xy.cuts, density, by=c(1,2), sort=FALSE, all.x=TRUE)
xy.density = xy.density[order(x=xy.density\$ord),]
xy.density\$Z[is.na(xy.density\$Z)]=0

Z.lim = quantile(xy.density\$Z, prob=c(0.1))
to.plot = xy.density\$Z > Z.lim[1]

par(mfrow=c(1,2))
plot(x1,x2, xlim=c(-3,3) ,ylim=c(-3,3))
plot(x1[to.plot], x2[to.plot], xlim=c(-3,3),ylim=c(-3,3))

On 19 November 2010 21:52, Roger Koenker <rkoenker at illinois.edu> wrote:
>
>> Also I wouldn't be surprised if there is a trick with quantile that
>> escapes my mind.
>
> I would be surprised...  this is all very dependent on how you do the
> density estimation, but you might look at contourLines and then try
> to use in.convex.hull() from tripack, but beware that things get more
> complicated if the density is multi-modal.
>
> You might also look at one of the packages that do "data depth"
> sorts of things.
>
> Roger Koenker
> rkoenker at illinois.edu
>
>
>
>
>
>

```