[R] Detection Times and Poisson Distribution
Karl Ove Hufthammer
karl at huftis.org
Wed Oct 28 09:33:00 CET 2009
On Tue, 27 Oct 2009 12:11:42 -0700 (PDT) Ben Bolker <bolker at ufl.edu>
wrote:
> This is not quite right because we have estimated the
> rate from the data -- from ?ks.test
>
...
>
> But perhaps not a bad start.
Actually, it is a very bad start. Using estimated parameters in tests
like ks.test gives you a *completely* wrong distribution of the test
statistic and the resulting p-value. Here's a simple example:
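library(MASS)  # for truehist()
n = 20  # sample size
r = 1   # true rate of the exponential distribution
# p-value when the true rate is supplied to ks.test
f = function(n, r)
{
  x = rexp(n, rate = r)
  ks.test(x, "pexp", rate = r)$p.value
}
# p-value when the rate is estimated from the same sample
g = function(n, r)
{
  x = rexp(n, rate = r)
  ks.test(x, "pexp", rate = 1/mean(x))$p.value
}
# Under the null hypothesis the p-values should be uniform on [0, 1]
truehist(replicate(1000, f(n, r)), h = .1, col = "wheat")
truehist(replicate(1000, g(n, r)), h = .1, col = "wheat")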
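To put a number on what the two histograms show, here is a minimal
sketch that estimates how often each variant rejects at the nominal 5%
level when the null hypothesis is true. The helper name reject_rate, the
seed, the number of simulations and the 5% level are my own choices for
illustration only.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Proportion of simulated p-values below alpha when the data really are
# exponential. 'estimate' switches between supplying the known rate and
# estimating the rate from the sample itself.
reject_rate = function(n, r, estimate, nsim = 2000, alpha = 0.05)
{
  p = replicate(nsim, {
    x = rexp(n, rate = r)
    rate = if (estimate) 1/mean(x) else r
    ks.test(x, "pexp", rate = rate)$p.value
  })
  mean(p < alpha)
}

known = reject_rate(20, 1, estimate = FALSE)     # should be near 0.05
estimated = reject_rate(20, 1, estimate = TRUE)  # should be far below 0.05
print(c(known = known, estimated = estimated))
```

With the known rate the rejection rate should land near the nominal
level, while with the estimated rate it should be much smaller.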
Note that increasing the number of observations n does *not* help. Also
note that under the null hypothesis, the parameter estimation makes the
test very conservative; i.e., it *reduces* the probability of a type I
error far below the nominal level. I'm not sure what the effect under
the alternative hypothesis is, but I know several papers have been
written on this topic.
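One common way around the problem is to simulate the null distribution
of the statistic with the estimation step included (a Lilliefors-type
or parametric bootstrap test). The following is only a sketch for the
exponential case, not a polished implementation. The function names
ks_stat_est and ks_pvalue_mc are made up here, and the code assumes the
rate is estimated as 1/mean(x). It relies on the exponential being a
scale family, so the distribution of the statistic with the rate
estimated this way does not depend on the true rate, and simulating
with rate 1 is enough.

```r
# KS statistic against an exponential distribution whose rate is
# estimated from the sample itself.
ks_stat_est = function(x)
{
  unname(ks.test(x, "pexp", rate = 1/mean(x))$statistic)
}

# Monte Carlo p-value: compare the observed statistic with statistics
# from samples drawn under the null and re-estimated in the same way.
ks_pvalue_mc = function(x, nsim = 2000)
{
  d_obs = ks_stat_est(x)
  d_null = replicate(nsim, ks_stat_est(rexp(length(x), rate = 1)))
  (1 + sum(d_null >= d_obs)) / (nsim + 1)
}

set.seed(1)  # arbitrary seed, only for reproducibility
ks_pvalue_mc(rexp(20, rate = 3))  # data from the null family
ks_pvalue_mc(runif(20))           # non-exponential data
```

Because the simulated samples go through the same estimation step as
the data, the resulting p-values should be approximately uniform under
the null hypothesis, unlike those from g above.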
--
Karl Ove Hufthammer