Minimum-entropy data partitioning using reversible jump Markov chain Monte Carlo
Roberts SJ., Holmes C., Denison D.
Problems in data analysis often require the unsupervised partitioning of a data set into classes. Several methods exist for such partitioning, but many are either formulated via strict parametric models (e.g., each class is modeled by a single Gaussian) or computationally intensive in high-dimensional data spaces. We reconsider the notion of such cluster analysis in information-theoretic terms and show that an efficient partitioning may be obtained by minimizing the partition entropy. Reversible-jump sampling is introduced to explore the variable-dimension space of partition models.
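As a minimal illustration of the entropy criterion, the sketch below computes the mean Shannon entropy of per-point partition posteriors. It assumes that partition memberships are available as an N-by-K matrix of probabilities; the function name `mean_partition_entropy`, the `eps` guard, and the toy matrices are hypothetical and are not taken from the paper, whose exact objective may differ.

```python
import numpy as np

def mean_partition_entropy(posteriors, eps=1e-12):
    """Mean Shannon entropy (in nats) of per-point partition posteriors.

    posteriors: array of shape (N, K); row n holds the assumed membership
    probabilities of data point n over K partitions.
    """
    p = np.asarray(posteriors, dtype=float)
    # Renormalize each row so it sums to one.
    p = p / p.sum(axis=1, keepdims=True)
    # Entropy of each point's membership distribution; eps avoids log(0).
    point_entropy = -np.sum(p * np.log(p + eps), axis=1)
    return float(point_entropy.mean())

# Toy example (made-up numbers): a nearly crisp partition versus a
# maximally ambiguous one over two partitions.
crisp = np.array([[0.99, 0.01], [0.02, 0.98]])
fuzzy = np.array([[0.5, 0.5], [0.5, 0.5]])

crisp_entropy = mean_partition_entropy(crisp)
fuzzy_entropy = mean_partition_entropy(fuzzy)
```

Under this assumed definition, a partitioning that assigns each point confidently to one class scores lower than one that leaves memberships ambiguous, which is the sense in which a minimum-entropy partition is preferred.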