Transactions on Mass-Data Analysis of Images and Signals (ISSN:1868-6451)
Volume 1 - Number 1 - September 2009 - Pages 48-61
Discovering Similar Frequent Fragments in Drug Design: A Clustering-Based Approach
B. Yılmaz, M. Göktürk
Gebze Institute of Technology, Department of Computer Engineering, Gebze, Kocaeli, Turkey
Designing new medical drugs requires the analysis of many molecules that are active against a specific disease. The main goal of these extensive analyses is to discover the active substructures (fragments) that account for the activity of these molecules. Once these fragments are discovered, they are used to synthesize new drugs for the disease. Current approaches for discovering active fragments rely heavily on frequent subgraph mining algorithms, which search for exactly repeating morphological substructures within a graph database. In this paper, however, we argue that in many settings active fragments do not repeat exactly but recur with small differences. This prevents frequent subgraph mining approaches from discovering these fragments. We propose a clustering-based approach to discover similar substructures that repeat across the active molecules in a molecular graph database. We have experimentally compared our approach with current methods using real-life and synthesized datasets. Our experiments show that the proposed approach successfully determines the fragments responsible for the desired biological activity and that, unlike other methods, it can determine frequent substructures that repeat in the graphs with small differences.
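The difference between exact repetition and repetition with small differences can be illustrated with a minimal Python sketch. It is not the method proposed in this paper: the encoding of a fragment as a set of labeled bonds, the Jaccard similarity measure, the greedy leader clustering, the 0.5 threshold, and all function names are assumptions made purely for illustration.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity between two sets of labeled bonds."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_fragments(fragments, threshold=0.5):
    """Greedy leader clustering (illustrative assumption).

    Each fragment joins the first cluster whose leader (its first member)
    is at least `threshold` similar; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster is a list of (molecule_id, fragment) pairs
    for mol_id, frag in fragments:
        for cluster in clusters:
            leader = cluster[0][1]
            if jaccard(frag, leader) >= threshold:
                cluster.append((mol_id, frag))
                break
        else:
            clusters.append([(mol_id, frag)])
    return clusters


def support(cluster):
    """Number of distinct molecules that contribute a fragment to the cluster."""
    return len({mol_id for mol_id, _ in cluster})


# Toy data: a hypothetical bond-set encoding, not a real molecular graph.
fragments = [
    ("m1", frozenset({"C-C", "C-N", "C=O", "C-O"})),
    ("m2", frozenset({"C-C", "C-N", "C=O", "C-S"})),  # differs in one bond
    ("m3", frozenset({"C-C", "C-N", "C=O", "C-O"})),
    ("m4", frozenset({"C-Cl", "C-F"})),
]

# Exact matching counts only identical fragments.
exact_counts = Counter(frag for _, frag in fragments)

# Similarity-based grouping also captures the near-identical fragment in m2.
clusters = cluster_fragments(fragments)
supports = [support(c) for c in clusters]
```

In this toy setting, exact counting finds the most common fragment in only two molecules, whereas the similarity-based grouping places the fragments of m1, m2, and m3 in one cluster. A set of bonds discards graph topology, so a realistic implementation would need a graph-aware similarity measure.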