Iterative clustering is a method for clustering points with the goal of producing evenly sized clusters. It can be seen as a very basic form of divisive (top-down) hierarchical clustering: start with everything in one cluster and keep splitting.
Some clustering algorithms yield sub-optimal results when asked for a large number of clusters. This is generally not a problem for exploratory, unsupervised ML, but it is a problem if you know in advance how many clusters you need, for example when splitting similar data into exactly K clusters.
Let’s say you want to end up with K clusters. Instead of asking the clustering algorithm for K clusters at once, just ask it to split your data in two, and repeat:
1. Find the biggest cluster (on the first pass, this is the only cluster, containing all points).
2. Split that cluster's points into two clusters.
3. If there are K distinct cluster labels, stop. Otherwise, go to step 1.
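
The steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names `iterative_clustering` and `two_means` are hypothetical, and `two_means` is only a tiny pure-Python stand-in for whatever algorithm you would actually use to split data in two. Any function that takes a list of points and returns one label per point (with exactly two distinct labels) could be passed as `split` instead.

```python
import random
from collections import Counter


def _sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def two_means(points, n_iter=20, seed=0):
    """Hypothetical stand-in splitter: a tiny 2-means returning 0/1 labels."""
    rng = random.Random(seed)
    centers = rng.sample(points, 2)  # two distinct input points as initial centers
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign each point to its nearest center (ties go to center 0).
        labels = [
            0 if _sq_dist(p, centers[0]) <= _sq_dist(p, centers[1]) else 1
            for p in points
        ]
        # Move each center to the mean of its members; keep it if it has none.
        for c in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return labels


def iterative_clustering(points, k, split=two_means):
    """Bisect the largest cluster until there are k distinct labels."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    labels = [0] * len(points)  # start with "the only cluster"
    next_label = 1
    while len(set(labels)) < k:
        # Step 1: find the biggest cluster.
        biggest, _ = Counter(labels).most_common(1)[0]
        idx = [i for i, lab in enumerate(labels) if lab == biggest]
        # Step 2: split that cluster's points in two.
        halves = split([points[i] for i in idx])
        if len(set(halves)) < 2:
            # Assumption: we give up rather than loop forever, e.g. when
            # all points in the cluster are identical.
            raise ValueError("the splitter failed to produce two clusters")
        # One half keeps the old label, the other half gets a fresh one.
        for i, h in zip(idx, halves):
            if h != halves[0]:
                labels[i] = next_label
        next_label += 1
        # Step 3: the loop condition checks whether we have k labels yet.
    return labels


# Usage: 12 distinct 2-D points split into 4 clusters.
points = [(float(i), float(i % 3)) for i in range(12)]
labels = iterative_clustering(points, k=4)
```

Because the cluster chosen for splitting is always the current largest, sizes tend to even out, although nothing here guarantees perfectly equal clusters: that depends on how the splitter divides each cluster.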