Retrievability provides a different way to evaluate an Information Retrieval (IR) system as it focuses on how easily documents can be found. It is intrinsically related to retrieval performance because a document needs to be retrieved before it can be judged relevant. In this paper, we undertake an empirical investigation into the relationship between the retrievability of documents, the retrieval bias imposed by a retrieval system, and the retrieval performance, across different amounts of document length normalization. To this end, two standard IR models are used on three TREC test collections to show that there is a useful and practical link between retrievability and performance. Our findings show that minimizing the bias across the document collection leads to good performance (though not the best performance possible). We also show that past a certain amount of document length normalization the retrieval bias increases, and the retrieval performance significantly and rapidly decreases. These findings suggest that the relationship between retrievability and effectiveness may offer a way to automatically tune systems.
|Title of host publication||SIGIR '13 Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval|
|Place of Publication||New York, NY, USA|
|Number of pages||4|
|Publication status||Published - 28 Jul 2013|