# AW: AW: [R] numericDeriv and ecdf

Khamenia, Valery V.Khamenia at BioVisioN.de
Mon Apr 28 10:18:37 CEST 2003

```Dear Prof. Brian Ripley,

how do you manage to keep up with all your activities
and still answer posts in this forum?

> An empirical CDF is a step function: it does not have a
> derivative at the jump points, and has a zero
> derivative everywhere else.

of course!

Let me add a few words about my simple motivation.

1. I need an estimate of differential entropy.

2. I don't want that estimate to be affected by smoothing
kernels or by other hypotheses that come in implicitly, as
explained below.

3. The formula for differential entropy is based on the density.

4. With real data, density estimates are possible only with
smoothing kernels.

5. Applying smoothing kernels is not appropriate when it is known
a priori that the family of distributions for my data is
extremely wide. (Indeed, I don't want any of the extra hypotheses
that come with smoothing kernels.)

6. The CDF is quite OK as an "ascetic" estimate of the
distribution, i.e. the CDF adds _nothing_ to (and removes
_nothing_ from) the hypotheses about the data distribution --
unlike density estimates based on smoothing kernels.

7. I don't know of a formula for estimating differential entropy
from the CDF.

8. Therefore I have to try to estimate differential entropy with
a density-based approach.

9. A histogram is quite a natural way to estimate a density.

10. Classical histograms are not appropriate when it is known
a priori that the family of distributions for my data is
extremely wide. Indeed:

a) one needs some assumptions about the distribution
in order to choose reasonable breaks for binning.

b) any binning that is reasonable in terms of histogram
properties tends to destroy knowledge about the distribution
_within_ a bin -- only a trivial histogram with breaks situated
next to the data points really preserves knowledge about the
distribution the way the ECDF does.

11. So we come to an "empirical density", which is a rather
uncommon term today. To get a feel for what I mean, please try:

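x <- sort( rnorm(10000) )   # sorted sample of size 10000
dx <- diff(x)               # spacings between neighbouring points
ed <- 1/10000/dx            # mass 1/n spread over each spacing
plot(x[-1], ed, log="y") # my "empirical density"
lines(x,dnorm(x),col=2)     # true N(0,1) density for comparison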

Now I can estimate differential entropy like this:

-sum(ed*log(ed)*dx)

That's it.

> What is this function `numericDerivative': do you mean `numericDeriv'?

yes. Sorry, my non-emacs email client has no auto-completion
like the ESS environment in emacs ;-)

kind regards,
Valery A.Khamenya
---------------------------------------------------------------------------
Bioinformatics Department
BioVisioN AG, Hannover

```
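The "empirical density" steps from the post can be collected into one function and checked against a case where the answer is known. The sketch below is only an illustration under stated assumptions: the name `spacing_entropy`, the seed, and the guard against tied observations are not from the original post. Since `ed * dx` is always `1/n`, the sum `-sum(ed*log(ed)*dx)` reduces to `sum(log(n*dx))/n`, so the estimate depends on the data only through the spacings. For a standard normal sample the closed-form differential entropy is `0.5*log(2*pi*e)` nats, which gives a reference value to compare with.

```r
# Hypothetical helper (not from the original post): entropy estimate
# built from the "empirical density" 1/(n*dx) between sorted points.
spacing_entropy <- function(x) {
  x  <- sort(x)
  n  <- length(x)
  dx <- diff(x)                  # n - 1 spacings between neighbours
  if (any(dx <= 0)) stop("tied observations give a zero spacing")
  ed <- 1 / (n * dx)             # empirical density on each spacing
  -sum(ed * log(ed) * dx)        # same expression as in the post
}

set.seed(1)                      # assumed seed, only for reproducibility
x   <- rnorm(10000)
est <- spacing_entropy(x)
ref <- 0.5 * log(2 * pi * exp(1))  # closed-form entropy of N(0, 1), in nats
print(c(estimate = est, reference = ref))
```

The two numbers should not be expected to agree closely: single-spacing estimates of this kind are generally reported to be biased, so the comparison is best read as a sanity check on the construction and not as a validation of it.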