[R] MCLUST Covariance Parameterization.
maj at stats.waikato.ac.nz
Mon Jun 7 09:29:08 CEST 2004
I'm not going to directly answer your question, but it seems to me that
you want to fit a completely unconstrained Gaussian mixture model. This
may not be the best thing to do: without constraints on the Sigma_k,
each component contributes p(p+1)/2 covariance parameters, so the model
may have more parameters than it is reasonable to try to estimate with
the available data.
A preliminary fit of a mixture model, even if it does not have quite the
covariance structure that you want, gives a clustering of the data that
you can use to explore the correlation structure of the components
empirically; you can then revise the covariance structure and re-fit.
I think that this sort of approach is likely to be more effective than
fitting the fully unstructured model directly.
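One way to do this kind of empirical exploration is to decompose each cluster's sample covariance matrix into the volume, shape, and orientation terms of the decomposition quoted below. What follows is only a minimal sketch, in plain Python rather than R, and restricted to two variables so that the eigendecomposition has a closed form. The function names are hypothetical and are not part of MCLUST, and the normalization det(A_k) = 1 is an assumption.

```python
import math


def sample_cov_2d(points):
    """Sample covariance (n - 1 denominator) of a list of (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return sxx, sxy, syy


def decompose_2d(sxx, sxy, syy):
    """Split a positive definite 2x2 covariance into (lambda, diag(A), angle).

    Assumed convention: det(A) = 1, so lambda = det(Sigma)^(1/2) when p = 2.
    The angle (radians) gives the direction of the leading eigenvector,
    i.e. the rotation that defines D.
    """
    mean = (sxx + syy) / 2.0
    radius = math.hypot((sxx - syy) / 2.0, sxy)
    e1, e2 = mean + radius, mean - radius      # eigenvalues, e1 >= e2
    lam = math.sqrt(e1 * e2)                   # volume factor
    shape = (e1 / lam, e2 / lam)               # diagonal of A, product is 1
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return lam, shape, angle


def summarize_clusters(points, labels):
    """Decompose the sample covariance of each cluster in a preliminary fit."""
    groups = {}
    for pt, lab in zip(points, labels):
        groups.setdefault(lab, []).append(pt)
    return {lab: decompose_2d(*sample_cov_2d(pts))
            for lab, pts in groups.items()}
```

If the lambda values are similar across clusters, an equal-volume model looks plausible; similar shape pairs point to equal shape, and similar angles to equal orientation. Shape pairs close to (1, 1) suggest a spherical model.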
KKThird at Yahoo.Com wrote:
> Hello all (especially MCLUST users).
> I'm trying to make use of the MCLUST package by C. Fraley and A. Raftery. My problem is trying to figure out how the (model) identifier (e.g., EII, VII, VVI, etc.) relates to the covariance matrix. The parameterization of the covariance matrix makes use of the method of decomposition in Banfield and Raftery (1993) and Fraley and Raftery (2002) where
> Sigma_k = lambda_k*D_k*A_k*D_k^'
> where Sigma_k is the covariance matrix for the kth group (k=1,...,G), lambda_k is the kth group's constant of proportionality, D_k is the orthogonal matrix of eigenvectors for the kth group, and A_k is a diagonal matrix whose elements are proportional to the eigenvalues. The parameterization of the covariance matrix Sigma_k depends on the distribution (whether spherical, diagonal, or ellipsoidal), volume (equal or variable), shape (equal or variable), and orientation (coordinate axes, equal, or variable). The distribution, volume, shape, and orientation are functions of lambda_k, D_k, and A_k. Thus, Sigma_k is defined by whether or not these values are constant across classes.
> What I'm trying to figure out is how the distribution, volume, shape, and orientation relate to Sigma_k. As far as the parameterization of Sigma_k is concerned, what do "distribution," "volume," "shape," and "orientation" even mean? Does a table exist of how these values relate to Sigma_k? I know a table exists in the MCLUST software manual on the MCLUST website, but this table doesn't relate the values of distribution, volume, shape, and orientation to Sigma_k directly, only to how Sigma_k would be parameterized (this isn’t helpful unless you know what distribution, volume, shape, and orientation mean in terms of the within-class covariance matrix). So, just what do the distribution, volume, shape, and orientation mean in the context of Sigma_k?
> What do the distribution, volume, shape, and orientation mean for Sigma_k=sigma^2*I, where I is the p by p identity matrix, sigma^2 is the constant variance, and Sigma_1=Sigma_2=...=Sigma_G? What about when Sigma_k=sigma^2_k*I, or when Sigma_1=Sigma_2=...=Sigma_G in situations where each element of the (constant across class) covariance matrix is different?
> I would say I have a pretty good understanding of finite mixture modeling, but nothing I've read (except the works cited in the 2002 JASA paper) talks about parameterizing the Sigma_k matrix in such a way. It would be nice to specify a structure directly for Sigma_k (as most books talk about). Any help on this issue would be greatly appreciated.
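For reference, the identifiers can be read letter by letter as volume, shape, and orientation, with E for equal across groups, V for variable, and I for identity (a spherical shape, or axes aligned with the coordinate axes). The table below is a sketch of the resulting forms of Sigma_k, written in the notation of the question and assuming the normalization det(A_k) = 1, so that lambda_k alone carries the volume. It should be checked against the MCLUST manual for the version in use.

```latex
\[
\begin{array}{lll}
\text{Identifier} & \Sigma_k & \text{Distribution} \\ \hline
\text{EII} & \lambda I                 & \text{spherical} \\
\text{VII} & \lambda_k I               & \text{spherical} \\
\text{EEI} & \lambda A                 & \text{diagonal} \\
\text{VEI} & \lambda_k A               & \text{diagonal} \\
\text{EVI} & \lambda A_k               & \text{diagonal} \\
\text{VVI} & \lambda_k A_k             & \text{diagonal} \\
\text{EEE} & \lambda D A D^{T}         & \text{ellipsoidal} \\
\text{EEV} & \lambda D_k A D_k^{T}     & \text{ellipsoidal} \\
\text{VEV} & \lambda_k D_k A D_k^{T}   & \text{ellipsoidal} \\
\text{VVV} & \lambda_k D_k A_k D_k^{T} & \text{ellipsoidal}
\end{array}
\]
```

On this reading, Sigma_k = sigma^2*I with all groups equal corresponds to EII, Sigma_k = sigma^2_k*I to VII, and a common but otherwise unrestricted covariance matrix to EEE.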
Dr Murray Jorgensen http://www.stats.waikato.ac.nz/Staff/maj.html
Department of Statistics, University of Waikato, Hamilton, New Zealand
Email: maj at waikato.ac.nz Fax 7 838 4155
Phone +64 7 838 4773 wk +64 7 849 6486 home Mobile 021 1395 862