Abstract
Many tasks related to or supporting information retrieval, such as query expansion, automated question answering, reasoning, or heterogeneous database integration, involve verifying membership in a semantic category (e.g., “coffee” is a drink and “red” is a color, while “steak” is not a drink and “big” is not a color). We present a novel framework that automatically validates membership in an arbitrary semantic category, one not trained a priori, up to a desired level of accuracy. Our approach does not rely on any manually codified knowledge but instead capitalizes on the diversity of topics and word usage in a large corpus (e.g., the World Wide Web). Using TREC factoid questions that expect the answer to belong to a specific semantic category, we show that a very high level of accuracy can be reached by automatically identifying more training seeds and more training patterns when needed. We develop a specific quantitative validation model that takes uncertainty and redundancy in the training data into consideration. We empirically confirm the important aspects of our model through ablation studies.
| Original language | English |
|---|---|
| Title of host publication | Advances in Information Retrieval Theory |
| Subtitle of host publication | Third International Conference, ICTIR 2011, Bertinoro, Italy, September 12-14, 2011. Proceedings |
| Pages | 274-284 |
| Number of pages | 11 |
| Volume | 6931 |
| Publication status | Published - 2011 |
Keywords
- information retrieval
- IR
- semantic searching
- semantic category verification
Title: Towards semantic category verification with arbitrary precision