FW: [R] distance between distributions

Fri May 6 21:04:13 CEST 2005

Sorry, forgot to send this to the list originally.

-----Original Message-----
From: Mike Waters [mailto:dr.mike at ntlworld.com] 
Sent: 06 May 2005 18:40
To: 'Campbell'
Subject: RE: [R] distance between distributions

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Campbell
Sent: 06 May 2005 11:19
To: vograno; r-help
Subject: Re: [R] distance between distributions

I may have missed the point here, but isn't this an obvious case for using
the bootstrap?  A paper by Mallows, the exact reference escapes me,
establishes the conditions under which asymptotics of the marginal
distribution imply a well-behaved limit.  Perhaps a better discussion of the
issues can be found in a pair of papers by Bickel and Freedman; see Annals
of Statistics, Vol. 9, No. 6.
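
As a rough illustration of the bootstrap suggestion, the sketch below
resamples both samples with replacement and recomputes a distance statistic
each time. The names quantile_distance, bootstrap_distance and
percentile_interval are hypothetical, the quantile-gap statistic is only a
stand-in for whatever distance is eventually chosen, and the percentile
interval is the crudest option. It does not check the asymptotic conditions
discussed in the papers mentioned above.

```python
import random

def quantile_distance(x, y, grid=99):
    """Mean absolute gap between empirical quantiles of x and y (illustrative choice)."""
    sx, sy = sorted(x), sorted(y)

    def q(s, p):
        # Simple order-statistic quantile, no interpolation.
        return s[int(p * (len(s) - 1))]

    ps = [(i + 1) / (grid + 1) for i in range(grid)]
    return sum(abs(q(sx, p) - q(sy, p)) for p in ps) / grid

def bootstrap_distance(x, y, stat, n_boot=500, seed=0):
    """Resample x and y with replacement; return sorted bootstrap replicates of stat."""
    rng = random.Random(seed)
    reps = []
    for _ in range(n_boot):
        xb = rng.choices(x, k=len(x))
        yb = rng.choices(y, k=len(y))
        reps.append(stat(xb, yb))
    return sorted(reps)

def percentile_interval(reps, level=0.90):
    """Crude percentile interval from sorted replicates."""
    k = int((1 - level) / 2 * len(reps))
    return reps[k], reps[len(reps) - 1 - k]

# Assumed toy data: two normal samples whose means differ by 0.5.
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(300)]
y = [rng.gauss(0.5, 1) for _ in range(300)]
reps = bootstrap_distance(x, y, quantile_distance)
print(quantile_distance(x, y), percentile_interval(reps))
```

The spread of the replicates gives a feel for how much of an observed
distance could be sampling noise.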

HTH
Phineas Campbell

>>> Vadim Ogranovich <vograno at evafunds.com> 05/06/05 1:36 AM >>>
Hi,

This is more of a general stat question. I am looking for an easily
computable measure of the distance between two empirical distributions.
Say I have two samples x and y drawn from X and Y. I want to compute a
statistic rho(x,y) which is zero if X = Y and grows as X and Y become less
similar.
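
Two candidates for such a rho(x,y) that need no density estimate are the
two-sample Kolmogorov-Smirnov statistic and the 1-Wasserstein distance, both
computed directly from the empirical CDFs. The sketch below is one possible
reading of the requirement, not something proposed in the thread; the
function names ecdf, ks_distance and wasserstein1 are hypothetical.

```python
from bisect import bisect_right

def ecdf(sample):
    """Return a function t -> fraction of sample values <= t."""
    s = sorted(sample)
    n = len(s)
    return lambda t: bisect_right(s, t) / n

def ks_distance(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: sup over t of |F_x(t) - F_y(t)|."""
    fx, fy = ecdf(x), ecdf(y)
    # Both ECDFs are right-continuous step functions, so the supremum
    # is attained at one of the observed data points.
    return max(abs(fx(t) - fy(t)) for t in set(x) | set(y))

def wasserstein1(x, y):
    """1-Wasserstein distance: integral of |F_x(t) - F_y(t)| dt."""
    fx, fy = ecdf(x), ecdf(y)
    pts = sorted(set(x) | set(y))
    # The integrand is constant between consecutive pooled data points.
    return sum(abs(fx(a) - fy(a)) * (b - a) for a, b in zip(pts, pts[1:]))

print(ks_distance([0, 1], [2, 3]), wasserstein1([0, 1], [2, 3]))
```

Both are zero when the two samples coincide. The KS statistic is bounded by
1 and responds mostly to the centre of the distributions, while the
Wasserstein distance is in the units of the data and gives more weight to
differences far out in the tails.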

Kullback-Leibler distance is the most "official" choice; however, it needs
an estimate of the density. Estimating the density requires one either to
choose a family of distributions to fit or to use some sort of
non-parametric estimation. I have no intuition as to whether the resulting
KL distance will be sensitive to the choice of the family of distributions
or of the fitting method.
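
One way to get a feel for this sensitivity is to compute a plug-in KL
estimate from histograms and vary the number of bins. The sketch below is an
assumption-laden illustration: hist_kl is a hypothetical helper, the
equal-width grid and the eps smoothing of empty cells are arbitrary choices,
and both samples are simulated from the same normal distribution, so the
true KL distance is zero.

```python
import math
import random

def hist_kl(x, y, bins, eps=1e-9):
    """Plug-in KL(X || Y) from histograms on a shared grid with `bins` cells."""
    lo = min(min(x), min(y))
    hi = max(max(x), max(y))
    width = (hi - lo) / bins or 1.0  # guard against a zero-width range

    def probs(sample):
        counts = [0] * bins
        for v in sample:
            counts[min(int((v - lo) / width), bins - 1)] += 1
        # eps smoothing (an arbitrary choice) keeps empty cells from giving log(0)
        total = len(sample) + eps * bins
        return [(c + eps) / total for c in counts]

    p, q = probs(x), probs(y)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

random.seed(0)
x = [random.gauss(0, 1) for _ in range(2000)]
y = [random.gauss(0, 1) for _ in range(2000)]
for bins in (5, 20, 80, 320):
    print(bins, hist_kl(x, y, bins))
```

With fine grids, cells in the tails are often occupied in one sample and
empty in the other, and the estimate is then driven largely by the smoothing
constant rather than by the data.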

Any suggestion of an alternative measure or insight into sensitivity of the
KL distance will be highly appreciated.

The distributions I deal with are those of stock returns; they are
qualitatively close to the normal distribution but with much fatter tails.
The tails in general should be modeled non-parametrically.

Thanks,
Vadim

____________________________________________________________________________

Was this the reference you were thinking of?

C. L. Mallows. A note on asymptotic joint normality. Annals of Mathematical
Statistics, 43(2):508-515, 1972.

Another reference that might be of relevance is:

Czado C. and Munk A. Bootstrap Methods for the Nonparametric Assessment of
Population Bioequivalence and Similarity of Distributions. This can be
obtained as a PostScript document from:
http://www-m4.ma.tum.de/Papers/Czado/simrev.ps

HTH

Mike

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html