Transactions on Machine Learning and Data Mining (ISSN: 1865-6781)
Volume 5 - Number 1 - July 2012 - Pages 3-22
Contrasting Correlations by an Efficient Double-Clique Condition
A. Li, M. Haraguchi and Y. Okubo
Graduate School of Information Science and Technology, Hokkaido University, N-14 W-9, Sapporo 060-0814, Japan
Contrast set mining has been extensively studied as a way to detect changes between contrasted databases. Previous studies mainly compared the supports of an itemset and extracted itemsets whose supports differ significantly across the databases. In contrast, we compare the correlations of an itemset between two contrasted databases and attempt to detect potential changes. To focus on implicitly emerging correlations, we disregard itemsets that are already highly correlated. We therefore impose correlation constraints (upper bounds) in both databases and extract itemsets whose items are not highly correlated in either database but that exhibit a potential change in correlation from one database to the other. We investigate both positive and negative correlations, as well as correlation conditioned on third variables, that is, so-called partial correlation. To measure this kind of correlation, we use extended mutual information. Our search procedure for correlated itemsets uses a double-clique condition, which is a necessary condition for an itemset to be a solution satisfying the correlation constraints. We show its usefulness through experiments.
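To make the idea of a double-clique filter concrete, the following is a minimal sketch and not the authors' algorithm. It assumes one plausible reading of the condition: an edge joins two items whose pairwise correlation stays at or below the upper bound in a database, and a candidate itemset must form a clique in the graphs of both databases. It also substitutes ordinary pairwise mutual information for the extended mutual information used in the paper. The function names (`mutual_information`, `low_correlation_pairs`, `is_double_clique`) and the toy data are hypothetical.

```python
from itertools import combinations
from math import log2


def mutual_information(db, a, b):
    """Mutual information (in bits) between the presence of items a and b.

    db is a list of transactions, each a set of items. This is plain
    pairwise mutual information, used here as a stand-in (assumption) for
    the extended mutual information mentioned in the abstract.
    """
    n = len(db)
    joint = {}
    for transaction in db:
        key = (a in transaction, b in transaction)
        joint[key] = joint.get(key, 0) + 1
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        p_x = sum(c for (x2, _), c in joint.items() if x2 == x) / n
        p_y = sum(c for (_, y2), c in joint.items() if y2 == y) / n
        mi += p_xy * log2(p_xy / (p_x * p_y))
    return mi


def low_correlation_pairs(db, items, upper_bound):
    """Edges of the graph: item pairs whose correlation is within the bound."""
    return {
        frozenset(pair)
        for pair in combinations(items, 2)
        if mutual_information(db, *pair) <= upper_bound
    }


def is_double_clique(itemset, edges_a, edges_b):
    """True if every pair in itemset is an edge in both graphs (assumed reading)."""
    return all(
        frozenset(pair) in edges_a and frozenset(pair) in edges_b
        for pair in combinations(sorted(itemset), 2)
    )


# Hypothetical toy databases: a and b are independent in db_a,
# but always co-occur in db_b.
db_a = [{"a", "b"}, {"a"}, {"b"}, set()]
db_b = [{"a", "b"}, {"a", "b"}, set(), set()]

edges_a = low_correlation_pairs(db_a, ["a", "b"], upper_bound=0.1)
edges_b = low_correlation_pairs(db_b, ["a", "b"], upper_bound=0.1)
candidate_ok = is_double_clique({"a", "b"}, edges_a, edges_b)
```

In this toy case the pair {a, b} satisfies the bound in `db_a` but exceeds it in `db_b`, so `candidate_ok` is false and the itemset would be pruned. Because the condition is only necessary, an itemset that passes such a filter would still need to be checked against the full correlation constraints.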