Median

K-means++

In data mining, k-means++[1][2] is an algorithm for choosing the initial values (or "seeds") for the k-means clustering algorithm.
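
The excerpt does not spell out the seeding rule itself, so the following is only a minimal sketch under the usual description of the method: the first seed is drawn uniformly at random from the data, and each later seed is drawn with probability proportional to its squared distance from the nearest seed already chosen. The function names `kmeans_pp_seeds` and `squared_distance` are illustrative and not taken from any library.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_pp_seeds(points, k, rng=None):
    """Pick k initial centers from `points` by distance-squared weighted sampling.

    Assumption: `points` is a non-empty list of equal-length numeric tuples.
    """
    rng = rng or random.Random()
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")

    # First seed: chosen uniformly at random from the data.
    centers = [rng.choice(points)]
    # d2[i] holds the squared distance from points[i] to its nearest seed so far.
    d2 = [squared_distance(p, centers[0]) for p in points]

    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with an existing seed; fall back to uniform.
            idx = rng.randrange(len(points))
        else:
            # Far-away points are proportionally more likely to be picked.
            idx = rng.choices(range(len(points)), weights=d2, k=1)[0]
        centers.append(points[idx])
        # Only the newly added seed can shrink a point's nearest-seed distance.
        d2 = [min(old, squared_distance(p, points[idx]))
              for old, p in zip(d2, points)]
    return centers
```

The returned seeds would then be passed to the standard k-means iterations as starting centers. A point that is already a seed has weight zero, so it is not drawn again while any other point remains at a nonzero distance.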

K-means++ was proposed in 2007 by David Arthur and Sergei Vassilvitskii as an approximation algorithm for the NP-hard k-means problem, a way of avoiding the sometimes poor clusterings found by the standard k-means algorithm. It is similar to the first of three seeding methods proposed, in independent work, in 2006[3] by Rafail Ostrovsky, Yuval Rabani, Leonard Schulman and Chaitanya Swamy. (The distribution of the first seed is different.)

Clustering - Introduction.