[R] Diptest- I'm getting significant values when I shouldn't?

kbrownk kbrownk at gmail.com
Thu Dec 22 21:50:04 CET 2011


Thanks, I found dip.test after posting. I reread the original paper
and found that the tabulated probability is the probability that the
dip is less than the given dip score. "Less" here is ambiguous to me,
and it is strange that dip.test interpolates from the same lookup
table I was using (qDiptab) but returns very different p values.
Anyway, the dip.test p values seem correct when I test them on
different distributions.
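
A possible explanation for the mismatch, sketched below under stated assumptions: the column labels of qDiptab appear to be cumulative probabilities Pr(D <= d) under the uniform null, so a dip that falls between the 0.02 and 0.05 columns would correspond to an upper-tail p value between 0.95 and 0.98, not between 0.02 and 0.05. The seed and the variable names (q1000, probs, cdf, p_table, p_test) are illustrative only, and the sketch relies on n = 1000 being one of the tabulated sample sizes.

```r
library(diptest)

set.seed(1)                       # arbitrary seed, for reproducibility only
x <- rnorm(1000)
d <- dip(x)                       # observed dip statistic

# qDiptab: rows are sample sizes n, columns are Pr(D <= d) under the uniform null
q1000 <- qDiptab["1000", ]
probs <- as.numeric(colnames(qDiptab))

# Interpolate the cumulative probability of the observed dip,
# then take the upper tail to get a p value
cdf     <- approx(q1000, probs, xout = d)$y
p_table <- 1 - cdf

# p value reported by dip.test() for the same sample
p_test <- dip.test(x)$p.value

print(c(dip = d, cdf = cdf, p_table = p_table, p_test = p_test))
```

If this reading of the table is right, p_table and p_test should agree closely, and reading the column label directly as the p value would give roughly 1 minus the value that dip.test reports.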

So, now that I can conclude there are multi-modal distributions, any
suggestions for finding the best distribution fits? I'm looking for a
procedure that can test different distribution types (normal, gamma,
etc.), provides parameters such as means and SDs for each
sub-distribution, and quantifies how good the fits are. I'm currently
looking into Expectation Maximization methods, particularly from the
mixtools R package. 'normalmixEM' looks like a good starting
procedure.
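
A minimal sketch of what such a starting point might look like, assuming the mixtools package is installed. The data are synthetic, and the helper bic_mix is hypothetical (it is not part of mixtools); it computes BIC from the fitted log-likelihood, counting 3k - 1 free parameters for an unconstrained k-component normal mixture.

```r
library(mixtools)

set.seed(42)                                      # arbitrary seed
# Synthetic bimodal sample: two normal components (illustrative values)
x <- c(rnorm(300, mean = 0, sd = 1), rnorm(200, mean = 5, sd = 1.5))

# Fit two- and three-component normal mixtures by EM
fit2 <- normalmixEM(x, k = 2)
fit3 <- normalmixEM(x, k = 3)

# Per-component estimates: mixing proportions, means, SDs
est <- data.frame(lambda = fit2$lambda, mu = fit2$mu, sigma = fit2$sigma)
print(est)

# Hypothetical helper: BIC for an unconstrained k-component normal mixture
# (k means + k SDs + (k - 1) free mixing proportions); lower is better
bic_mix <- function(fit, n) {
  k <- length(fit$mu)
  -2 * fit$loglik + (3 * k - 1) * log(n)
}

print(c(k2 = bic_mix(fit2, length(x)), k3 = bic_mix(fit3, length(x))))
```

Comparing BIC across k is only one way to judge the fit. EM results can depend on the random starting values, so rerunning with different seeds is a sensible check.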

Thanks,
kbrownk

On Dec 22, 3:32 pm, Duncan Murdoch <murdoch.dun... at gmail.com> wrote:
> On 21/12/2011 3:37 PM, kbrownk wrote:
>
> > From library(diptest):
>
> > Shouldn't the following almost always be non-significant for
> > Hartigan's dip test?
>
> > dip(x = rnorm(1000))
>
> Well, it should be non-significant about 95% of the time
>
> > I get dip scores of around 0.0008 which based on p values taken from
> > the table (at N=1000), using the command: qDiptab, are
> > 0.02 < p < 0.05.
>
> > Anyone familiar with Hartigan's dip test and what I may not be
> > understanding?
>
> Why not use dip.test()?  When I do that, I see the p-values are almost
> all quite large:
>
> hist(replicate(1000, dip.test(x=rnorm(1000))$p.value))
>
> Using runif() gives something apparently on the boundary, as you'd expect:
>
> hist(replicate(1000, dip.test(x=runif(1000))$p.value))
>
> Duncan Murdoch
>
> ______________________________________________
> R-h... at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


