Transactions on Case-Based Reasoning
Volume 4 - Number 1 - October 2011 - Pages 3-17
The Problem of Normalization and a Normalized Similarity Measure by Online Data
A. Attig and P. Perner
Institute of Computer Vision and Applied Computer Sciences, IBAI, Leipzig, Germany
Case-based reasoning and image or data retrieval rely on determining the similarity between the current case and the cases in a database. It is preferable to normalize the similarity values to the range from 0 to 1 so that different similarity values can be compared on a common scale. Normalization thus gives the similarity value a semantic meaning. The main problem arises when the case base is not yet complete: it contains only a small number of cases, and further cases are added incrementally as they arrive in the system. In this situation the upper and lower bounds of the feature values cannot be estimated close to their real values. This paper examines possible methods for predicting the upper and lower bounds of a feature value, and the problems that arise when these bounds are estimated incorrectly because the number of samples is limited or the parameter distribution is not available a priori. The aim is to develop a method for learning the upper and lower bounds of a feature value, and a methodology for dealing with the resulting change in the semantic meaning of the similarity.
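The effect described in the abstract can be illustrated with a minimal Python sketch. It is not the method developed in the paper: it assumes the simplest possible bound estimate, the running minimum and maximum of the values seen so far, and a similarity defined as one minus the range-normalized absolute difference. The class name `OnlineBounds` and its methods are hypothetical and chosen only for this illustration.

```python
class OnlineBounds:
    """Tracks the observed lower and upper bound of a single numeric feature.

    Assumption for illustration: the bounds are the running minimum and
    maximum of all values seen so far.
    """

    def __init__(self):
        self.lower = None
        self.upper = None

    def update(self, value):
        """Widen the bounds if the newly arrived value lies outside them."""
        self.lower = value if self.lower is None else min(self.lower, value)
        self.upper = value if self.upper is None else max(self.upper, value)

    def similarity(self, a, b):
        """Return a similarity in [0, 1] based on the current bounds."""
        if self.lower is None:
            raise ValueError("no values observed yet")
        span = self.upper - self.lower
        if span == 0:
            # Degenerate range: only identical values count as similar.
            return 1.0 if a == b else 0.0
        # Clip the distance at 1 in case a or b lies outside the known bounds.
        distance = min(abs(a - b) / span, 1.0)
        return 1.0 - distance


bounds = OnlineBounds()
for value in (2.0, 4.0):          # small initial case base
    bounds.update(value)
early = bounds.similarity(2.0, 4.0)

bounds.update(12.0)               # a new case arrives and widens the range
later = bounds.similarity(2.0, 4.0)

print(early, later)
```

With only the first two cases known, the pair (2.0, 4.0) spans the whole observed range and is judged maximally dissimilar. After the third case widens the range, the same pair is judged fairly similar. This shift in what a given similarity value means is the kind of change in semantic meaning that the abstract refers to.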