[R] distance between distributions

Fri May 6 12:19:25 CEST 2005

I may have missed the point here, but isn't this an obvious case for
using the bootstrap?  A paper by Mallows (the exact reference escapes
me) establishes the conditions under which asymptotics of the marginal
distribution imply a well-behaved limit.  A better discussion of
the issues can perhaps be found in a pair of papers by
Bickel and Freedman; see Annals of Statistics, Vol. 9, No. 6.
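
A minimal sketch of the bootstrap idea, written in Python with only the
standard library. The two-sample Kolmogorov-Smirnov statistic is an
assumed choice of distance here (it needs no density estimate); it is
not taken from the papers mentioned above. The function names and the
percentile interval are likewise illustrative. Resampling each sample
separately only gives a rough picture of the sampling variability of
the distance; it is not a formal test of X = Y.

```python
import bisect
import random


def ks_distance(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: max over t of |F_x(t) - F_y(t)|."""
    xs, ys = sorted(x), sorted(y)
    d = 0.0
    # Both empirical CDFs are right-continuous step functions, so the
    # maximum gap is attained at one of the observed data points.
    for t in xs + ys:
        fx = bisect.bisect_right(xs, t) / len(xs)
        fy = bisect.bisect_right(ys, t) / len(ys)
        d = max(d, abs(fx - fy))
    return d


def bootstrap_distance(x, y, n_boot=1000, seed=0):
    """Resample x and y with replacement and recompute the distance each time."""
    rng = random.Random(seed)
    reps = []
    for _ in range(n_boot):
        xb = rng.choices(x, k=len(x))
        yb = rng.choices(y, k=len(y))
        reps.append(ks_distance(xb, yb))
    return sorted(reps)


def percentile_interval(sorted_reps, level=0.90):
    """Crude percentile interval read off the sorted bootstrap replicates."""
    n = len(sorted_reps)
    lo = sorted_reps[int((1 - level) / 2 * n)]
    hi = sorted_reps[min(n - 1, int((1 + level) / 2 * n))]
    return lo, hi
```

Given two lists of returns x and y, ks_distance(x, y) is the observed
distance, and percentile_interval(bootstrap_distance(x, y)) indicates
how much that value moves under resampling.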

HTH
Phineas Campbell

>>> Vadim Ogranovich <vograno at evafunds.com> 05/06/05 1:36 AM >>>
Hi,

This is more of a general stat question. I am looking for an easily
computable measure of the distance between two empirical distributions.
Say I have two samples x and y drawn from X and Y. I want to compute a
statistic rho(x,y) which is zero if X = Y and grows as X and Y become
less similar.

Kullback-Leibler distance is the most "official" choice; however, it
requires an estimate of the density. Estimating the density requires
one either to choose a family of distributions to fit or to use some
sort of non-parametric estimation. I have no intuition as to whether
the resulting KL distance will be sensitive to the choice of the family
of distributions or of the fitting method.

Any suggestion of an alternative measure, or insight into the
sensitivity of the KL distance, will be highly appreciated.

The distributions I deal with are those of stock returns; they are
qualitatively close to the normal distribution but have much fatter
tails. The tails in general should be modeled non-parametrically.

Thanks,
Vadim
