Gaussian Mixture Models (GMMs)

Gaussian Mixture Models

  • A probabilistic model.
  • Assumes all data points are generated from a mixture of a finite number of Gaussian
    distributions.
  • The parameters of the Gaussian distributions are unknown.
  • It generalizes k-means clustering (or k-medoids or k-modes, for that matter) by using the
    covariance structure of the data as well as the centers (means) of the latent
    Gaussians.
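
To make the points above concrete, here is a minimal sketch using scikit-learn's GaussianMixture. The synthetic data, the cluster centers, and the choice of two components are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic 2-D data drawn from two Gaussians (illustrative values only).
rng = np.random.default_rng(0)
a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(200, 2))
b = rng.normal(loc=[5.0, 5.0], scale=1.0, size=(200, 2))
X = np.vstack([a, b])

# covariance_type="full" lets each component learn its own covariance matrix,
# which is what k-means (means only) does not model.
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
gmm.fit(X)

print(gmm.weights_)      # mixing proportions, one per component
print(gmm.means_)        # one mean vector per component
print(gmm.covariances_)  # one covariance matrix per component

# Soft assignment: each row holds the posterior probability of each component.
resp = gmm.predict_proba(X)
# Hard assignment: the most probable component for each point.
labels = gmm.predict(X)
```

Unlike k-means, each point gets a probability of belonging to every component rather than a single cluster label.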

scikit-learn

Pros:

  • Speed: the fastest algorithm for learning mixture models
  • Agnostic: it maximizes only the likelihood, so it does not bias the means towards zero or
    bias the cluster sizes to have specific structures

Cons:

  • When there are too few points per mixture component, estimating the covariance matrices
    becomes difficult.
  • Number of components: the model will always use all the components it has access to, so
    held-out data or an information-theoretic criterion may be needed to decide how many to use.

  • The number of components can be chosen using the Bayesian information criterion (BIC).

  • A variational Bayesian Gaussian mixture avoids having to specify the number of components
    exactly; only an upper bound is needed.
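
The BIC-based choice of the number of components can be sketched as follows. The three-cluster synthetic data and the candidate range 1 to 6 are assumptions for illustration; the sketch fits one model per candidate and keeps the one with the lowest BIC.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic data with three well-separated clusters (illustrative values only).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.5, size=(200, 2)),
    rng.normal(loc=[5.0, 5.0], scale=0.5, size=(200, 2)),
    rng.normal(loc=[0.0, 5.0], scale=0.5, size=(200, 2)),
])

# Fit one model per candidate component count and record its BIC.
candidates = range(1, 7)
bics = []
for k in candidates:
    gmm = GaussianMixture(n_components=k, random_state=0).fit(X)
    bics.append(gmm.bic(X))  # lower BIC is better

# Keep the component count with the lowest BIC.
best_k = list(candidates)[int(np.argmin(bics))]
print(best_k)
```

BIC penalizes the extra parameters of larger models, so it can be computed on the training data itself without reserving a held-out set.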

Variational Bayesian Gaussian Mixture
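
A minimal sketch with scikit-learn's BayesianGaussianMixture, under assumed settings: the synthetic two-cluster data, the upper bound of 8 components, the prior value 0.01, and the 0.05 weight cutoff are all illustrative choices, not recommendations.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

# Synthetic data with two clusters (illustrative values only).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.5, size=(200, 2)),
    rng.normal(loc=[5.0, 5.0], scale=0.5, size=(200, 2)),
])

# n_components acts as an upper bound here, not as the final component count.
bgmm = BayesianGaussianMixture(
    n_components=8,
    weight_concentration_prior_type="dirichlet_process",
    weight_concentration_prior=0.01,  # small value favors fewer active components
    max_iter=500,
    random_state=0,
).fit(X)

# Components the data does not support tend to end up with weights near zero.
print(np.round(bgmm.weights_, 3))

# Count components above an arbitrary (hypothetical) weight cutoff.
effective = int(np.sum(bgmm.weights_ > 0.05))
print(effective)
```

Inspecting weights_ after fitting shows how many of the available components the model actually uses.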

Fitting a Gaussian model to data