[R] Help with normal distributions

Christian Hennig fm3a004 at math.uni-hamburg.de
Mon Oct 4 13:08:28 CEST 2004


Hi Michael,

> Secondly, and perhaps more difficult, is a second data set.  This, when
> plotted as a histogram, has two clear peaks, perhaps even three, all of
> which look as though they are normally distributed.  So the theory is
> that my data set is actually made up of two, possibly three, underlying
> sub-sets of data which are normally distributed, but with different
> means and standard deviations.  So 1) how do I test for this? And 2) how
> can I estimate the parameters (mean and SD) for the underlying
> distributions?

The answer to 2), as already pointed out, is to use EMclust in package
mclust.
Testing for the presence of a mixture is difficult from a theoretical
point of view, and as far as I know, nothing for it is implemented in R yet.
What you can do is:
a) Let EMclust estimate the number of mixture components by BIC (it can
also choose a single component).
b) Use a standard normality test such as shapiro.test to reject homogeneous
normality. A rejection tells you that you have to fit something more complex
than a single normal, but it does not tell you what.
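
A rough sketch of a) and b) together, not a definitive recipe. It makes
some assumptions that are not part of the advice above: the data are
simulated here as a stand-in for the real data set, and the fit uses
Mclust(), which as far as I know is the interface that more recent mclust
releases offer for the same BIC-based selection (EMclust is the older
name). The element names read from the fitted object are those of current
mclust versions and may differ in older ones.

```r
library(mclust)  # assumed to be installed

set.seed(1)
# Simulated stand-in for the real data: two normal sub-populations
# with different means and standard deviations.
x <- c(rnorm(200, mean = 0, sd = 1), rnorm(200, mean = 6, sd = 1.5))

# b) Shapiro-Wilk test of the hypothesis of one homogeneous normal.
sw <- shapiro.test(x)
print(sw$p.value)  # a small p-value speaks against a single normal

# a) Fit normal mixtures with 1 to 3 components; BIC picks the number.
fit <- Mclust(x, G = 1:3)
print(fit$G)                # number of components chosen by BIC
print(fit$parameters$pro)   # estimated mixing proportions
print(fit$parameters$mean)  # estimated component means
# Estimated component SDs (a single value if the equal-variance
# model was selected).
print(sqrt(fit$parameters$variance$sigmasq))
```

If BIC selects G = 1, a single normal was preferred over the mixtures.
summary(fit, parameters = TRUE) prints the same estimates in one table.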

Christian

> 
> Thanks in advance for your help
> 
> Mick

***********************************************************************
Christian Hennig
Fachbereich Mathematik-SPST/ZMS, Universitaet Hamburg
hennig at math.uni-hamburg.de, http://www.math.uni-hamburg.de/home/hennig/
#######################################################################
I recommend www.boag-online.de



