[R] CDFs

David Winsemius dwinsemius at comcast.net
Tue Aug 23 01:17:52 CEST 2011


On Aug 22, 2011, at 6:26 PM, R. Michael Weylandt wrote:

>
>
> On Mon, Aug 22, 2011 at 4:57 PM, David Winsemius <dwinsemius at comcast.net 
> > wrote:
>
> On Aug 22, 2011, at 4:34 PM, David Winsemius wrote:
>
>
> On Aug 22, 2011, at 3:50 PM, R. Michael Weylandt wrote:
>
> Yes. The xCDF/yCDF objects that are returned by the ecdf function  
> can be
> called like functions.
>
> Because they _are_ functions.
>
> > "function" %in% class(xCDF)
> [1] TRUE
> > is.function(xCDF)
> [1] TRUE
>
> You know, I spent a good 30 seconds trying to figure out how to put  
> that, knowing that whatever I said someone would pounce on, yes it  
> is a function on one hand, but it's not only a function on the  
> other...the dangerous world of being the small fish in the semi- 
> anonymous list serve pool...point definitely taken though. What's  
> the official way to say it? "xCDF has a function class"?

"xCDF is an R function."  # would be how I would say it.

The trick here is that ecdf-objects are computed on the basis of  
another object in the workspace but the information is stored in the  
ecdf function's environment. When you wrap that object in an  
additional call to the "function()" function you may be making your  
implementation much more fragile. The function xCDF has the knots  
stored with it in its environment, but the Fx function only has a  
reference to xCDF and no environment other than .GlobalEnv, so  
there isn't much left if you then remove xCDF, whereas removing x will  
leave xCDF entirely functional.

 > environment(xCDF)
<environment: 0x385705340>
 > ls(env=environment(xCDF))
[1] "f"      "method" "n"      "nobs"   "x"      "y"      "yleft"  "yright"
 > environment(xCDF)$x
  [1] -2.363555812 -2.036395899 -1.627785957 -1.554706917 -1.541211694
  [6] -1.290059870 -1.208761869 -1.027517109 -0.981711990 -0.848029056
[11] -0.809689052 -0.678832827 -0.574025735 -0.554839320 -0.509889638
[16] -0.502089844 -0.455731547 -0.424468236 -0.343728630 -0.300323734
[21] -0.288451556 -0.188242567 -0.139732427 -0.137601990 -0.083517129
[26] -0.009441695  0.018491182  0.063308320  0.094458225  0.145796550
[31]  0.184200096  0.193462918  0.286655660  0.296894739  0.340814704
[36]  0.436575933  0.445344391  0.455784057  0.609317046  0.684856461
[41]  0.714905811  0.784777207  0.803642616  0.878443730  1.014727110
[46]  1.182792891  1.544940127  1.859003832  2.852197035  3.049627080
 > environment(xCDF)$y
  [1] 0.02 0.04 0.06 0.08 0.10 0.12 0.14 0.16 0.18 0.20 0.22 0.24 0.26 0.28
[15] 0.30 0.32 0.34 0.36 0.38 0.40 0.42 0.44 0.46 0.48 0.50 0.52 0.54 0.56
[29] 0.58 0.60 0.62 0.64 0.66 0.68 0.70 0.72 0.74 0.76 0.78 0.80 0.82 0.84
[43] 0.86 0.88 0.90 0.92 0.94 0.96 0.98 1.00
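
The fragility point can be checked with a small sketch. The seed and the evaluation point 0.3 are arbitrary choices of mine, and Fx is the wrapper from earlier in the thread; treat this as an illustration rather than a recipe.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
x    <- rnorm(50)
xCDF <- ecdf(x)
Fx   <- function(z) xCDF(z)       # wrapper: looks xCDF up when it is called

before <- xCDF(0.3)

rm(x)                             # xCDF keeps its own copy of the knots
after_rm_x <- xCDF(0.3)           # still works

rm(xCDF)                          # now the wrapper has nothing to call
res <- try(Fx(0.3), silent = TRUE)
fx_broken <- inherits(res, "try-error")
```

Here before and after_rm_x should agree, while the call to Fx fails once xCDF is gone.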

My testing suggests that the `save` function will retain the environment  
but the `dump` function will not.
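
A sketch of the `save` half of that test, assuming a temporary file is acceptable (the file name, the seed and the value 0.3 are my own choices). The `dump` half is left out of the sketch.

```r
set.seed(2)                            # arbitrary seed
x    <- rnorm(50)
xCDF <- ecdf(x)
before <- xCDF(0.3)

tmp <- tempfile(fileext = ".RData")    # hypothetical scratch file
save(xCDF, file = tmp)                 # serializes the function and its environment
rm(x, xCDF)                            # wipe both from the workspace

load(tmp)                              # restores xCDF only
after <- xCDF(0.3)
env_names <- ls(environment(xCDF))     # the knots came back with the function
unlink(tmp)
```

If the environment survived the round trip, before and after are identical and env_names still lists "x" and "y".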

> For example:
>
> x = rnorm(50); xCDF = ecdf(x); xCDF(0.3)
> # This value tells you what fraction of x is less than or equal to 0.3
>
> You can also assign this behavior to a function:
>
> F <- function(z) { xCDF(z) }
>
> F does not inherit xCDF directly though and loses the step-function-ness of
> the xCDF object. (Compare plots of F and xCDF to see one consequence)
>
> Not correct. Steps are still there in the same locations.
>
> Yes, but
>
> > "stepfun" %in% class(xCDF)
> TRUE
>
> > "stepfun" %in% class(F)
> FALSE
>
> is what I meant. plot.stepfun() gives those nice little dots at the  
> jumps that plot.function() doesn't -- hence my reference to the  
> graph. It is an admittedly minor difference though.

The plots looked identical to me. Not sure what difference you are  
referring to.
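
For anyone who wants to see the class difference without plotting, here is a sketch. Fw is a hypothetical name for the wrapper (the thread calls it F, which masks the shorthand for FALSE), and the seed is arbitrary.

```r
set.seed(3)                          # arbitrary seed
x    <- rnorm(50)
xCDF <- ecdf(x)
Fw   <- function(z) xCDF(z)          # hypothetical name for the wrapper

is_step_xCDF <- inherits(xCDF, "stepfun")   # TRUE: ecdf objects are step functions
is_step_Fw   <- inherits(Fw, "stepfun")     # FALSE: a plain function

grid <- seq(-2, 2, by = 0.02)
same_values <- all(xCDF(grid) == Fw(grid))  # the values agree everywhere on the grid

# plot(xCDF) dispatches on the class and is drawn by plot.stepfun;
# plot(Fw, -2, 2) goes to plot.function, which draws a curve through
# evaluated points instead.
```

So the two objects differ in class and in which plot method they get, but not in the numbers they return.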
>
>
>
> So yes, you can do subtraction on this basis
>
> x = rnrom(50); Fx = ecdf(x); Fx <- function(z) { xCDF(z) }
>
> You are adding an unnecessary function "layer". Try (after  
> correcting the misspelling):
>
> xCDF(seq(-2,2,by=0.02)) == Fx(seq(-2,2,by=0.02)) # => creating Fx is superfluous
>
> x <- function(x){function(x) x}  <==> x <- function(x){ x}
>
> "Turtles all the way down."
>
> Just a stupid typo: meant to define xCDF = ecdf(x) as before: I know  
> the extra function term is silly.  I do like turtles  
> though...preferably in chocolate than soup
>
>
>
> y = rnorm(50); yCDF = ecdf(y); Fy <- function(z) { yCDF(z) }
>
> F <- function(z) {Fx(z) - Fy(z)}
> # F <- function(z) {xCDF(z)-yCDF(z)} # Another way to do the same thing
>
> As this would have this:
>
> F = function(z) xCDF(z)-yCDF(z)
> plot(seq(-2,2,by=0.02), F(seq(-2,2,by=0.02)) ,type="l")
>
> Interesting plot by the way. Unit steps at Gaussian random  
> intervals. I'm not sure my intuition would have gotten there all on  
> its own. I guess that arises from the discreteness of the sampling.  
> I wasn't thinking that ecdf was the inverse function but seem to  
> remember someone (some bloke named Weylandt, now that I check)   
> saying as much earlier in the day.
>
>
> I take it back. Not necessarily unit jumps, Quantized, yes, but the  
> sample I'm looking at has jumps of 0,1,2, and 3  * 0.02 units.   
> Poisson?  (Probably a homework problem in Feller.)
>
> That is a fun little puzzle: now I have something to ponder on the  
> train tonight.
>
> And don't listen to that Weylandt bloke, I have it on quite good  
> authority he doesn't actually know what he's doing.
>
> (By the way, is this a reference to the question about qexp()  
> earlier today? I hope I said that was the inverse CDF, not the CDF  
> itself: if so, I owe someone quite an apology...)

It was, but not in a negative sense.  The notion that the quantile  
function is the inverse of the CDF seemed perfectly fine to me. One  
maps from probability [0,1] to the outside world, the other maps from  
a set of observations to probability.
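
A sketch of that inverse relationship, under my own choice of probabilities and seed. For the empirical case I am assuming quantile() with type = 1, which the documentation describes as the inverse of the empirical distribution function.

```r
p <- c(0.1, 0.25, 0.5, 0.9)          # arbitrary probabilities

# Theoretical pair: qexp maps probability to a value, pexp maps it back
roundtrip_exp <- pexp(qexp(p))
exp_ok <- isTRUE(all.equal(roundtrip_exp, p))

# Empirical pair: quantile(type = 1) and ecdf
set.seed(4)                          # arbitrary seed
x    <- rnorm(50)
xCDF <- ecdf(x)
q1   <- quantile(x, probs = p, type = 1, names = FALSE)
back <- xCDF(q1)                     # at least p, because the ecdf jumps in steps of 1/50
```

The theoretical round trip recovers p up to floating point, while the empirical one can only land on multiples of 1/50 at or above p.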

  I was remarking on the fact that the construction of two eCDFs at  
equal quantiles made for some interesting properties in the  
differences of such eCDFs. Random musings on random walks, you could  
say.
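
A sketch for anyone who wants to tabulate those jumps rather than eyeball the plot. The seed is arbitrary and F_diff is a hypothetical name for the difference function; the grid is the one used in the plot earlier in the thread.

```r
set.seed(5)                           # arbitrary seed
n    <- 50
xCDF <- ecdf(rnorm(n))
yCDF <- ecdf(rnorm(n))
F_diff <- function(z) xCDF(z) - yCDF(z)   # hypothetical name

grid  <- seq(-2, 2, by = 0.02)
jumps <- diff(F_diff(grid))           # change in the difference between grid points
steps <- jumps * n                    # expressed in units of 1/n = 0.02

jump_table <- table(round(steps))     # how often each multiple occurs
```

Each entry of steps is a whole number up to rounding error, since both eCDFs move in units of 1/n. The multiples are signed here, because the difference can step down as well as up.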

-- 
David.
>
>
> (who has learned not to tempt the gods with imprecise references to  
> R's class functionality :-) )

David 'not a god' Winsemius, MD
West Hartford, CT


