[R] outlier detection methods in r?

Mon Apr 24 16:18:54 CEST 2000

At 05:35 21/04/00 -0700, Robert L. Sandefur wrote:
>and calculating the probability of a value of 4 or bigger in 100 samples of
>norm(0,1) gives
>> 1-exp(log(pnorm(4,0,1))*100)
>[1] 0.003162164

I do not understand the above formula. I'd do it as follows: if p is the
probability of getting a value of 4 or bigger from a normal distribution
with mean=0 and var=1, then the probability of getting one (and only one)
value equal to or greater than 4 in 100 independent draws from the same
normal law is given by the probability mass function of the binomial law:

> 1-pnorm(4,0,1) -> p
> dbinom(1, 100, p)
[1] 0.003157209

which is slightly smaller than what is reported by Rob. However, the
probability to get *at least* one value of 4 or bigger in the same sample is:

> 1-pbinom(0, 100, p)
[1] 0.003162164

which is identical to the reported value. The probabilities to get one,
two, three, or four values equal to or greater than 4 are:

> dbinom(1:4, 100, p)
[1] 3.157209e-03 4.949797e-06 5.121192e-09 3.933342e-12

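They can easily be summed. The terms for five or more such values are
negligible, so the sum agrees with the "at least one" probability to the
printed precision: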

> sum(dbinom(1:4, 100, p))
[1] 0.003162164
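
The agreement between these numbers can be checked directly. The sketch
below is only an illustration: the variable names (n, by_hand, quoted,
complement, binom_tail, leftover) are mine and do not come from either
message. It relies on the fact that exp(log(x) * 100) is just x^100, so the
quoted formula is 1 minus the probability that all 100 draws fall below 4,
which is also what 1 - pbinom(0, 100, p) computes.

```r
# P(X >= 4) for one draw from N(0, 1), as in the session above
p <- 1 - pnorm(4, 0, 1)
n <- 100  # number of independent draws

# Exactly one value >= 4: binomial probability written out by hand
by_hand <- choose(n, 1) * p * (1 - p)^(n - 1)

# At least one value >= 4, computed three ways
quoted     <- 1 - exp(log(pnorm(4, 0, 1)) * n)  # formula from the quoted message
complement <- 1 - pnorm(4, 0, 1)^n              # 1 - P(all n draws < 4)
binom_tail <- 1 - pbinom(0, n, p)               # 1 - P(zero successes)

# Probability mass ignored by stopping the sum at four values
leftover <- sum(dbinom(1:n, n, p)) - sum(dbinom(1:4, n, p))

print(all.equal(by_hand, dbinom(1, n, p)))
print(all.equal(quoted, binom_tail))
print(all.equal(complement, binom_tail))
print(leftover)
```

The three all.equal() calls should each report TRUE, and leftover should be
far smaller than the last printed digit of 0.003162164.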

Emmanuel Paradis