Towards semantic category verification with arbitrary precision

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution book


Many tasks related to or supporting information retrieval, such as query expansion, automated question answering, reasoning, or heterogeneous database integration, involve verification of a semantic category (e.g., “coffee” is a drink and “red” is a color, while “steak” is not a drink and “big” is not a color). We present a novel framework to automatically validate membership in an arbitrary semantic category, not one trained a priori, up to a desired level of accuracy. Our approach does not rely on any manually codified knowledge but instead capitalizes on the diversity of topics and word usage in a large corpus (e.g., the World Wide Web). Using TREC factoid questions that expect the answer to belong to a specific semantic category, we show that a very high level of accuracy can be reached by automatically identifying more training seeds and more training patterns when needed. We develop a specific quantitative validation model that takes uncertainty and redundancy in the training data into consideration. We empirically confirm the important aspects of our model through ablation studies.
Original language: English
Title of host publication: Advances in Information Retrieval Theory
Subtitle of host publication: Third International Conference, ICTIR 2011, Bertinoro, Italy, September 12-14, 2011. Proceedings
Number of pages: 11
Publication status: Published - 2011


  • information retrieval
  • IR
  • semantic searching
  • semantic category verification

