论文标题
通过半决赛编程的素描和解决k-均值聚类的方法
Sketch-and-solve approaches to k-means clustering by semidefinite programming
论文作者
论文摘要
我们介绍了一种素描和解决方法,以加快K-均值聚类的彭 - 韦半决赛放松。当数据适当分离时,我们确定K-均值的最佳聚类。否则,我们的方法在最佳K-均值值上提供了高信心的下限。该下限是数据驱动的;它没有对数据或如何生成的任何假设。我们提供代码和一组广泛的数值实验,其中我们使用这种方法来证明K-Means ++获得的聚类解决方案的近似最佳性。
We introduce a sketch-and-solve approach to speed up the Peng-Wei semidefinite relaxation of k-means clustering. When the data is appropriately separated we identify the k-means optimal clustering. Otherwise, our approach provides a high-confidence lower bound on the optimal k-means value. This lower bound is data-driven; it does not make any assumption on the data nor how it is generated. We provide code and an extensive set of numerical experiments where we use this approach to certify approximate optimality of clustering solutions obtained by k-means++.