Transactions on Machine Learning and Data Mining (ISSN: 1865-6781)
Volume 6 - Number 2 - July 2013 - Pages 45-80
A Phrase-based Ontology Enabled Semantic Processing System for Web Search
Joseph Leone and Dong-Guk Shin
University of Connecticut, USA
The Semantic Processing System (SPS) performs phrase search of web content. SPS takes a user query in natural language, converts it to a keyword query, expands the keyword query with synonyms, hypernyms, hyponyms, and meronyms, and submits the expanded query to a search engine. SPS then sifts through the search engine result pages, extracting grammatical and semantic information from each page to compute the page’s relevance to the natural language query. SPS's relevance computation uses semantic matching of phrases rather than term-and-document frequency weighting, the method most commonly used by existing web search engines. SPS consults an ontology that is both “crowd-sourced,” i.e., built collaboratively and incrementally by a large number of users, and “auto-learned,” i.e., contextually inferred from sentences containing desired words. SPS would be suitable for biomedical literature mining, legal document review and discovery, and news/RSS feed monitoring because these areas are laden with prose text. We implemented a prototype SPS, experimented with it, and demonstrate that SPS outperforms a representative keyword-based search engine. The strength of SPS stems from its exploitation of phrase semantics, which conventional search engines do not use.
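The query-expansion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the stop-word list, the toy lexicon, and the function names `to_keywords`, `expand_keywords`, and `to_boolean_query` are all hypothetical, and a real system would draw related terms from an ontology rather than a hand-written dictionary.

```python
from typing import Dict, List, Set

# Hypothetical stop-word list; not taken from the paper.
STOP_WORDS: Set[str] = {
    "a", "an", "the", "of", "in", "for", "to",
    "what", "which", "is", "are", "do", "does",
}

# Toy lexicon standing in for the ontology (illustrative entries only).
TOY_LEXICON: Dict[str, Dict[str, List[str]]] = {
    "car": {
        "synonyms": ["automobile"],
        "hypernyms": ["vehicle"],
        "hyponyms": ["sedan"],
        "meronyms": ["engine"],
    },
}

RELATIONS = ("synonyms", "hypernyms", "hyponyms", "meronyms")


def to_keywords(query: str) -> List[str]:
    """Reduce a natural language query to lowercase content words."""
    tokens = [t.strip(".,?!").lower() for t in query.split()]
    return [t for t in tokens if t and t not in STOP_WORDS]


def expand_keywords(
    keywords: List[str],
    lexicon: Dict[str, Dict[str, List[str]]] = TOY_LEXICON,
) -> Dict[str, List[str]]:
    """Attach related terms to each keyword, preserving order, without duplicates."""
    expanded: Dict[str, List[str]] = {}
    for kw in keywords:
        terms = [kw]
        related = lexicon.get(kw, {})
        for relation in RELATIONS:
            for term in related.get(relation, []):
                if term not in terms:
                    terms.append(term)
        expanded[kw] = terms
    return expanded


def to_boolean_query(expanded: Dict[str, List[str]]) -> str:
    """Join each keyword's alternatives with OR and the keyword groups with AND."""
    groups = ["(" + " OR ".join(terms) + ")" for terms in expanded.values()]
    return " AND ".join(groups)


if __name__ == "__main__":
    keywords = to_keywords("Which car is reliable?")
    print(to_boolean_query(expand_keywords(keywords)))
```

The Boolean OR/AND form is only one assumed way of presenting an expanded query to a search engine; the abstract does not specify the query syntax SPS uses.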