Transactions on Machine Learning and Data Mining (ISSN: 1865-6781)

Volume 3 - Number 2 - October 2010 - Pages 67-79

Fast Algorithms for Constant Approximation k-Means Clustering

M. Song and S. Rajasekaran

Computer Science and Engineering, University of Connecticut
Storrs CT 06269, USA


In this paper we study the k-means clustering problem. It is well-known that the general version of this problem is NP-hard. Numerous approximation algorithms have been proposed for this problem. In this paper, we propose three constant approximation algorithms for k-means clustering. The first algorithm runs in time Ο((k/ε)knd), where k is the number of clusters, n is the size of input points, and d is dimension of attributes. The second algorithm runs in time Ο(k3n2 log n). This is the first algorithm for k-means clustering that runs in time polynomial in n, k and d simultaneously. The run time of the third algorithm Ο(k5 log3 kd) is independent of n. Though an algorithm whose run time is independent of n is known for the k-median problem, ours is the first such algorithm for the k-means problem.

PDFDownload Paper (135 KB)

Back to Table of Contents