Transactions on Machine Learning and Data Mining (ISSN: 1865-6781)
Volume 1 - Number 2 - October 2008 - Page 67-82
Distributed Monitoring of Frequent Items
R. Fuller and M. Kantardzic
Computer Engineering and Computer Science Department, University of Louisville, USA
Abstract: Monitoring frequently occurring items is a recurring task in a variety of applications. Although a number of solutions have been proposed, few address the problem in a distributed networked environment. Most past solutions relied upon approximating results to lower communication overhead. In this paper we introduce a new algorithm designed for continuously tracking frequent items over distributed data streams, providing either exact or approximate answers. We tested the efficiency of our method using two real-world data sets. The results indicated a significant reduction in communication cost when compared to naïve approaches and an existing efficient algorithm called Top-K Monitoring. Since our method does not rely upon approximations to reduce communication overhead and is explicitly designed for tracking frequent items, it also shows increased quality in its tracking results.