[R] How to do a probability density based filtering in 2D?

Emmanuel Levy emmanuel.levy at gmail.com
Sat Nov 20 04:32:26 CET 2010

Hello Roger,

Thanks for the suggestions.

I finally managed to do it using the output of kde2d - The code is
pasted below. Actually this made me realize that the outcome of kde2d
can be quite influenced by outliers if a boundary box is not given
(try running the code without the boundary box, e.g.lims, and you will

x1 = rnorm(10000)
x2 = rnorm(10000)

z1 = kde2d(x1,x2,n=nBins, lims=c(-4,4,-4,4))

x1.cut = cut(x1, seq(z1$x[1], z1$x[nBins],len=(nBins+1) ))
x2.cut = cut(x2, seq(z1$y[1], z1$y[nBins],len=(nBins+1) ))
xy.cuts = data.frame(x1.cut,x2.cut, ord=1:(length(x1.cut)) )

density = data.frame( X=rep(factor(levels(x1.cut)),rep(nBins,nBins) ),
Y=rep(factor(levels(x2.cut)),nBins) , Z= as.vector(z1$z))

xy.density = merge( xy.cuts, density, by=c(1,2), sort=FALSE, all.x=TRUE)
xy.density = xy.density[order(x=xy.density$ord),]

Z.lim = quantile(xy.density$Z, prob=c(0.1))
to.plot = xy.density$Z > Z.lim[1]

plot(x1,x2, xlim=c(-3,3) ,ylim=c(-3,3))
plot(x1[to.plot], x2[to.plot], xlim=c(-3,3),ylim=c(-3,3))

On 19 November 2010 21:52, Roger Koenker <rkoenker at illinois.edu> wrote:
>> Also I wouldn't be surprised if there is a trick with quantile that
>> escapes my mind.
> I would be surprised...  this is all very dependent on how you do the
> density estimation, but you might look at contourLines and then try
> to use in.convex.hull() from tripack, but beware that things get more
> complicated if the density is multi-modal.
> You might also look at one of the packages that do "data depth"
> sorts of things.
> Roger Koenker
> rkoenker at illinois.edu

More information about the R-help mailing list