Combination of Similarity Measures for Effective Spoken Document Retrieval

F. Crestani

Research output: Contribution to journal › Article › peer-review

12 Citations (Scopus)


Users of information retrieval systems and document authors often use different terms to refer to the same concept. For this simple reason, information retrieval is affected by the 'term mismatch' problem. The term mismatch problem not only hinders the retrieval of relevant documents, it also produces poor rankings of relevant documents. A similar problem can be found in spoken document retrieval, where terms misrecognized by the speech recognition process can hinder the retrieval of potentially relevant spoken documents. We will call this problem 'term misrecognition', by analogy with the term mismatch problem. This paper presents two classes of retrieval models that attempt to tackle both the term mismatch and the term misrecognition problems at retrieval time using term similarity information. The models use either complete or partial knowledge of semantic and phonetic term similarity, estimated from the corpus using statistical methods.
Original language: English
Pages (from-to): 87-96
Number of pages: 9
Journal: Journal of Information Science
Issue number: 2
Publication status: Published - 2003


  • similarity measures
  • information retrieval
  • spoken document retrieval
