[R] What ECDF function?

Shiazy Fuzzy shiazy at gmail.com
Sat Jun 9 18:57:56 CEST 2007


Hello!

I want to draw a P-P plot, so I've implemented this function:

ppplot <- function(x, dist, ...)
{
  # Look up the theoretical CDF by name, e.g. dist = "norm" gives pnorm
  cdf <- get(paste("p", dist, sep = ""), mode = "function")
  x <- sort(x)
  # Theoretical probabilities on the x-axis, empirical ones on the y-axis
  plot(cdf(x, ...), ecdf(x)(x))
}

I have two questions:
1. Is it correct to draw the reference line as follows:

    xx <- cdf(x, ...)
    yy <- ecdf(x)(x)
    fit <- lm(yy ~ xx)
    abline(fit)

  or is something else better?
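
To make question 1 concrete, here is a minimal sketch on simulated normal data. The sample, the seed, and the names `p_theo`, `p_emp` and `fit` are only assumptions for illustration. It draws the fitted regression line from the snippet above next to the identity line y = x, which is where the points would fall if the empirical and theoretical probabilities agreed exactly:

```r
set.seed(1)                       # illustrative seed, not part of the question
x <- sort(rnorm(50))              # simulated sample; any data would do

p_theo <- pnorm(x)                # theoretical CDF at the sorted data
p_emp  <- ecdf(x)(x)              # empirical CDF, i/n for untied data

plot(p_theo, p_emp, xlim = c(0, 1), ylim = c(0, 1),
     xlab = "Theoretical probability", ylab = "Empirical probability")

fit <- lm(p_emp ~ p_theo)         # regression line, as in the snippet above
abline(fit, lty = 2)              # dashed: fitted line
abline(0, 1)                      # solid: identity line y = x
```

The fitted line adapts to the data, so it can look good even when the points sit away from the diagonal, whereas the identity line is fixed in advance.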

2. I found various versions of the P-P plot that use ((1:n)-0.5)/n
instead of the "ecdf" function.
  After some investigation I found that there are different definitions
of the ECDF (here "i" is the rank):
  * Kaplan-Meier: i/n
  * modified Kaplan-Meier: (i-0.5)/n
  * Median Rank: (i-0.3)/(n+0.4)
  * Herd-Johnson: i/(n+1)
  * ...
  Furthermore, similar expressions are used by "ppoints".
  So:
  2.1 For a P-P plot, which one should I use?
  2.2 In general, why should I prefer one definition of the ECDF over another?

  (Note: this issue might also apply to the Q-Q plot; in fact, qqnorm uses
ppoints instead of ecdf.)
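
For comparison, the definitions listed in question 2 can be put side by side. This is only a sketch: the sample size `n` and the column names are made up for illustration. It relies on `ppoints(n, a)` computing (i - a)/(n + 1 - 2a), which makes several of the listed formulas special cases of it:

```r
n <- 20                           # illustrative sample size
i <- 1:n                          # ranks of the sorted observations

pos <- cbind(
  km       = i / n,                     # "Kaplan-Meier", same as ecdf() on untied data
  mod_km   = (i - 0.5) / n,             # "modified Kaplan-Meier"
  med_rank = (i - 0.3) / (n + 0.4),     # "median rank"
  herd_j   = i / (n + 1)                # "Herd-Johnson"
)

# ppoints(n, a) computes (i - a) / (n + 1 - 2 * a), so three of the
# definitions above correspond to a particular choice of a.
all.equal(pos[, "mod_km"],   ppoints(n, a = 0.5))
all.equal(pos[, "med_rank"], ppoints(n, a = 0.3))
all.equal(pos[, "herd_j"],   ppoints(n, a = 0))

# Largest plotting position under each definition
apply(pos, 2, max)
```

Of the four, only i/n reaches exactly 1 at the largest observation. That is harmless on a P-P plot, but it would give an infinite quantile if the positions were passed to a q-function such as qnorm, which may be one reason qqnorm relies on ppoints.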

Thank you very much!!

Sincerely,

-- Marco


