Exploring models for semantic category verification

D. Roussinov, O. Turetken

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

Many artificial intelligence tasks, such as automated question answering, reasoning, or heterogeneous database integration, involve verification of a semantic category (e.g. 'coffee' is a drink and 'red' is a color, while 'steak' is not a drink and 'big' is not a color). In this research, we explore completely automated, on-the-fly verification of membership in an arbitrary category that has not been anticipated a priori. Our approach does not rely on any manually codified knowledge (such as WordNet or Wikipedia) but instead capitalizes on the diversity of topics and word usage on the World Wide Web; it can thus be considered 'knowledge-light' and complementary to 'knowledge-intensive' approaches. We have created a quantitative verification model and established (1) which specific variables are important and (2) what ranges and upper limits of accuracy are attainable. While our semantic verification algorithm is entirely self-contained (it does not involve any previously reported components that are beyond the scope of this paper), we have tested it empirically within our fact-seeking engine on the well-known TREC conference test questions. With our implementation of semantic verification, answer accuracy improved by up to 16%, depending on the specific models and metrics used.
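
The abstract does not describe the verification model itself, so the following is only a toy illustration of the general 'knowledge-light' idea, not the authors' method. It assumes that membership can be approximated by counting how often an instance appears inside simple textual patterns such as "coffee is a drink", relative to how often the instance appears at all. The pattern list, the in-memory corpus (a stand-in for Web text), and the names `PATTERNS`, `TOY_CORPUS`, `count_matches` and `membership_score` are all hypothetical.

```python
import re
from typing import Iterable, List

# Hypothetical validation patterns; the abstract does not list the ones
# actually used in the paper.
PATTERNS = (
    "{instance} is a {category}",
    "{category} such as {instance}",
    "{instance} and other {category}s",
)

# Tiny in-memory stand-in for Web text (an assumption for illustration).
TOY_CORPUS: List[str] = [
    "Coffee is a drink enjoyed around the world.",
    "She ordered a drink such as coffee or tea.",
    "The steak was served with a drink.",
    "Red is a color, and big is a size.",
]


def count_matches(phrase: str, corpus: Iterable[str]) -> int:
    """Count case-insensitive whole-phrase occurrences of `phrase`."""
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
    return sum(len(pattern.findall(doc)) for doc in corpus)


def membership_score(instance: str, category: str, corpus: Iterable[str]) -> float:
    """Ratio of pattern matches to all mentions of `instance`.

    Returns 0.0 when the instance never occurs in the corpus.
    """
    docs = list(corpus)
    instance_count = count_matches(instance, docs)
    if instance_count == 0:
        return 0.0
    pattern_count = sum(
        count_matches(p.format(instance=instance, category=category), docs)
        for p in PATTERNS
    )
    return pattern_count / instance_count


for instance, category in [("coffee", "drink"), ("steak", "drink"),
                           ("red", "color"), ("big", "color")]:
    print(instance, category, membership_score(instance, category, TOY_CORPUS))
```

On this toy corpus the positive examples from the abstract ('coffee' as a drink, 'red' as a color) score higher than the negative ones ('steak' as a drink, 'big' as a color). A real system would presumably replace the corpus with Web search counts and learn how to weight and threshold the pattern statistics.
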
Language: English
Pages: 753-765
Number of pages: 13
Journal: Information Systems
Volume: 34
Issue number: 8
DOI: 10.1016/j.is.2009.03.007
Publication status: Published - Dec 2009

Keywords

  • semantic category verification
  • automated question answering
  • text mining
  • world wide web
  • search engines

Cite this

Roussinov, D. ; Turetken, O. / Exploring models for semantic category verification. In: Information Systems. 2009 ; Vol. 34, No. 8. pp. 753-765.
@article{cf22c87f58d9492ba0b06ac5b83185e8,
title = "Exploring models for semantic category verification",
abstract = "Many artificial intelligence tasks, such as automated question answering, reasoning, or heterogeneous database integration, involve verification of a semantic category (e.g. 'coffee' is a drink, 'red' is a color, while 'steak' is not a drink and 'big' is not a color). In this research, we explore completely automated on-the-fly verification of a membership in any arbitrary category which has not been expected a priori. Our approach does not rely on any manually codified knowledge (such as WordNet or Wikipedia) but instead capitalizes on the diversity of topics and word usage on the World Wide Web, thus can be considered 'knowledge-light' and complementary to the 'knowledge-intensive' approaches. We have created a quantitative verification model and established (1) what specific variables are important and (2) what ranges and upper limits of accuracy are attainable. While our semantic verification algorithm is entirely self-contained (not involving any previously reported components that are beyond the scope of this paper), we have tested it empirically within our fact seeking engine on the well known TREC conference test questions. Due to our implementation of semantic verification, the answer accuracy has improved by up to 16{\%} depending on the specific models and metrics used.",
keywords = "semantic category verification, automated question answering, text mining, world wide web, search engines",
author = "D. Roussinov and O. Turetken",
year = "2009",
month = "12",
doi = "10.1016/j.is.2009.03.007",
language = "English",
volume = "34",
pages = "753--765",
journal = "Information Systems",
issn = "0306-4379",
number = "8",

}
