Iterative clustering is a method for clustering points with the goal of making evenly-sized clusters. It can be considered a really basic way to perform hierarchical clustering.

Some clustering algorithms may yield sub-optimal results when asked for a large number of clusters. While this is generally not a problem for unsupervised ML, it is a problem if you know how many clusters you are going to need, for example when splitting similar data into N clusters.

Let’s say you want to end up with K clusters. Instead of asking the clustering algorithm for K clusters, just ask it to split your data in two.

  1. Find the biggest cluster, or the only cluster
  2. Split the points into two clusters
  3. If there are K distinct cluster labels, stop. Otherwise, go to 1.