Relating retrievability, performance and length

Colin Wilkie, Leif Azzopardi

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution book

16 Citations (Scopus)

Abstract

Retrievability provides a different way to evaluate an Information Retrieval (IR) system as it focuses on how easily documents can be found. It is intrinsically related to retrieval performance because a document needs to be retrieved before it can be judged relevant. In this paper, we undertake an empirical investigation into the relationship between the retrievability of documents, the retrieval bias imposed by a retrieval system, and the retrieval performance, across different amounts of document length normalization. To this end, two standard IR models are used on three TREC test collections to show that there is a useful and practical link between retrievability and performance. Our findings show that minimizing the bias across the document collection leads to good performance (though not the best performance possible). We also show that past a certain amount of document length normalization the retrieval bias increases, and the retrieval performance significantly and rapidly decreases. These findings suggest that the relationship between retrievability and effectiveness may offer a way to automatically tune systems.
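The abstract does not say how retrievability or retrieval bias is computed. As a minimal sketch, assume the cumulative retrievability measure (a document scores one point for each query that returns it within a rank cutoff) and the Gini coefficient as the bias summary; both are common choices in the retrievability literature, but their use here is an assumption. The function names, the cutoff value, and the toy rankings are hypothetical and are not taken from the paper.

```python
from collections import Counter


def cumulative_retrievability(rankings, doc_ids, cutoff=10):
    """For each document, count the queries that return it in the top `cutoff` results.

    `rankings` is a list of ranked document-id lists, one per query.
    Documents that are never retrieved get a score of 0.
    """
    counts = Counter()
    for ranked in rankings:
        counts.update(set(ranked[:cutoff]))
    return {doc_id: counts.get(doc_id, 0) for doc_id in doc_ids}


def gini(values):
    """Gini coefficient of non-negative scores.

    0.0 means every document is equally retrievable; values near 1.0 mean
    retrievability is concentrated on a few documents (high bias).
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


# Toy data (hypothetical): three queries over a five-document collection.
doc_ids = ["d1", "d2", "d3", "d4", "d5"]
rankings = [
    ["d1", "d2", "d3"],
    ["d1", "d3", "d4"],
    ["d1", "d2", "d4"],
]
scores = cumulative_retrievability(rankings, doc_ids, cutoff=2)
bias = gini(scores.values())
print(scores, round(bias, 3))
```

Under these assumptions, a study like the one described would repeat this calculation once per length-normalization setting and compare the resulting bias values with a performance measure computed from the same runs.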
Language: English
Title of host publication: SIGIR '13 Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval
Place of Publication: New York, NY, USA
Pages: 937-940
Number of pages: 4
DOIs: https://doi.org/10.1145/2484028.2484145
Publication status: Published - 28 Jul 2013
Externally published: Yes


Keywords

  • retrievability
  • simulation

Cite this

Wilkie, C., & Azzopardi, L. (2013). Relating retrievability, performance and length. In SIGIR '13 Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 937-940). New York, NY, USA. https://doi.org/10.1145/2484028.2484145
@inproceedings{95720aef57a14c6880f260dfa6440cad,
title = "Relating retrievability, performance and length",
keywords = "retrievability, simulation",
author = "Colin Wilkie and Leif Azzopardi",
year = "2013",
month = "7",
day = "28",
doi = "10.1145/2484028.2484145",
language = "English",
isbn = "978-1-4503-2034-4",
pages = "937--940",
booktitle = "SIGIR '13 Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval",

}

