Query length, retrievability bias and performance

Colin Wilkie, Leif Azzopardi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution book

1 Citation (Scopus)


Past work has shown that longer queries tend to lead to better retrieval performance. However, this comes at the cost of increased user effort and additional system processing. In this paper, we examine whether there are benefits of longer queries beyond performance. We posit that increasing the query length will also lead to a reduction in the retrievability bias. Additionally, we speculate that to minimise retrievability bias as queries become longer, more length normalisation must be applied to account for the increase in the length of documents retrieved. To this end, we perform a retrievability analysis on two TREC collections using three standard retrieval models and various lengths of queries (one to five terms). From this investigation we find that increasing the length of queries reduces the overall retrievability bias but at a decreasing rate. Moreover, once the query length exceeds three terms the bias can begin to increase (and the performance can start to drop). We also observe that more document length normalisation is typically required as query length increases, in order to minimise bias. Finally, we show that there is a strong correlation between performance and retrieval bias. This work raises some interesting questions regarding query length and its effect on performance and bias. Further work will be directed towards examining longer and more verbose queries, including those generated via query expansion methods, to obtain a more comprehensive understanding of the relationship between query length, performance and retrievability bias.
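As an illustration of what a retrievability analysis computes, the following is a minimal sketch and not the authors' implementation. It assumes the cumulative retrievability measure (a document scores one point for each query that ranks it within a cutoff) and summarises bias with the Gini coefficient, a common choice in retrievability studies; the abstract does not state which measures or cutoffs were used. The function names, document identifiers and toy rankings are hypothetical.

```python
from collections import Counter


def cumulative_retrievability(rankings, cutoff):
    """Count, per document, how many queries retrieve it within the top `cutoff` ranks.

    `rankings` maps a query id to its ranked list of document ids (assumed input format).
    """
    scores = Counter()
    for ranked_docs in rankings.values():
        for doc_id in ranked_docs[:cutoff]:
            scores[doc_id] += 1
    return scores


def gini(values):
    """Gini coefficient of non-negative values: 0 means perfectly equal, values near 1 mean concentrated."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard formula over the sorted values, with 1-based index i.
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


# Toy data (hypothetical): four documents and three queries with ranked results.
collection = ["d1", "d2", "d3", "d4"]
rankings = {
    "q1": ["d1", "d2", "d3"],
    "q2": ["d1", "d3", "d2"],
    "q3": ["d1", "d4", "d2"],
}

r = cumulative_retrievability(rankings, cutoff=2)
# Include documents that were never retrieved, since they contribute to the bias.
scores = [r.get(doc_id, 0) for doc_id in collection]
bias = gini(scores)
print(scores, bias)
```

Under these assumptions, repeating the calculation for query sets of each length (one to five terms) and for each retrieval model setting would give one bias value per configuration, which is the kind of comparison the abstract describes.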
Original language: English
Title of host publication: CIKM '15 Proceedings of the 24th ACM International on Conference on Information and Knowledge Management
Place of Publication: New York, NY, USA
Number of pages: 4
Publication status: Published - 17 Oct 2015
Externally published: Yes


  • evaluation
  • retrievability
  • query length

