[Rd] ppoints (PR#7538)

Tobias Verbeke tobias.verbeke at telenet.be
Mon Jan 24 18:32:57 CET 2005


On Mon, 24 Jan 2005 09:37:44 +0000 (GMT)
Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:

> On Wed, 19 Jan 2005 tobias.verbeke at telenet.be wrote:
> 
> > Dear r-bugs,
> >
> > Whilst playing with ppoints I discovered
> > that when one uses it directly, occasional
> > NA's in a vector also become data fractions:
> >
> > ppoints(c(1,2,NA,4))
> >
> > Would it be a good idea to add a warning message
> > as in:
> >
> > ppoints <- function (n, a = ifelse(n <= 10, 3/8, 1/2))
> > {
> >    if(any(is.na(n))) warning("'n' contains NA's")
> >    if(length(n) > 1) n <- length(n)
> >    if(n > 0)
> >        (1:n - a)/(n + 1-2*a)
> >    else numeric(0)
> > }
> 
> Why?  There are 4 points in your vector, and the result is perfectly 
> valid as documented, even if they were all NAs.

I was using ppoints to draw a quantile plot for a first look at a
distribution, and I forgot to remove the NAs beforehand.
For example, the data of Chambers, Cleveland et al. (1983), Graphical
Methods for Data Analysis, p. 15, Fig. 2.4:

Stamford <-
c(66, 52, NA, NA, NA, NA, 49, 64, 68, 26, 86, 52, 43, 75, 87,
188, 118, 103, 82, 71, 103, 240, 31, 40, 47, 51, 31, 47, 14,
NA, 71, 61, 47, NA, 196, 131, 173, 37, 47, 215, 230, NA, 69,
98, 125, 94, 72, 72, 125, 143, 192, NA, 122, 32, 114, 32, 23,
71, 38, 136, 169, 152, 201, 134, 206, 92, 101, 119, 124, 133,
83, NA, 60, 124, 142, 124, 64, 75, 103, NA, 46, 68, NA, 87, 27,
NA, 73, 59, 119, 64, NA, 111, 80, 68, 24, 24, 82, 100, 55, 91,
87, 64, NA, NA, 170, NA, 86, 202, 71, 85, 122, 155, 80, 71, 28,
212, 80, 24, 80, 169, 174, 141, 202, 113, 38, 38, 28, 52, 14,
38, 94, 89, 99, 150, 146, 113, 38, 66, 38, 80, 80, 99, 71, 42,
52, 33, 38, 24, 61, 108, 38, 28, NA)

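# ppoints() only uses the length of a vector argument, so drop the NAs first
xco <- ppoints(na.omit(Stamford))
# sort() removes NAs by default, so yco has the same length as xco
yco <- sort(Stamford)
plot(xco, yco,
     pch = 20,
     xlab = "FRACTION OF DATA",
     ylab = "QUANTILES OF OZONE DATA",
     cex = 0.6)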
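
To make the pitfall concrete, here is a minimal sketch using the small
vector from my original report rather than the ozone data. The names
p_all, p_obs and y are only illustrative.

```r
# Small vector from the original report, not the ozone data
x <- c(1, 2, NA, 4)

# ppoints() only looks at the length of a vector argument,
# so the NA still counts as a point
p_all <- ppoints(x)            # same as ppoints(4)
p_obs <- ppoints(na.omit(x))   # same as ppoints(3)

# sort() drops NAs by default (na.last = NA)
y <- sort(x)

length(p_all) == length(y)     # FALSE, so plot(p_all, y) would fail
length(p_obs) == length(y)     # TRUE
```

Without na.omit() the two coordinate vectors have different lengths, which
is how I noticed the problem in the first place.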


> > Another minor remark concerning ?ppoints. It says:
> >
> > n: either the number of points generate or a vector of
> >          observations.     ^^^^^
> 
> As you see, that does not line up, but the typo has been fixed.

Thank you for your answer (and fix).
Tobias

> -- 
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>


