Past work has shown that longer queries tend to lead to better retrieval performance. However, this comes at the cost of increased user effort and additional system processing. In this paper, we examine whether there are benefits of longer queries beyond performance. We posit that increasing the query length will also lead to a reduction in the retrievability bias. Additionally, we speculate that to minimise retrievability bias as queries become longer, more length normalisation must be applied to account for the increase in the length of documents retrieved. To this end, we perform a retrievability analysis on two TREC collections using three standard retrieval models and various lengths of queries (one to five terms). From this investigation, we find that increasing the length of queries reduces the overall retrievability bias, but at a decreasing rate. Moreover, once the query length exceeds three terms, the bias can begin to increase (and the performance can start to drop). We also observe that more document length normalisation is typically required as query length increases in order to minimise bias. Finally, we show that there is a strong correlation between performance and retrieval bias. This work raises some interesting questions regarding query length and its effect on performance and bias. Further work will be directed towards examining longer and more verbose queries, including those generated via query expansion methods, to obtain a more comprehensive understanding of the relationship between query length, performance and retrievability bias.
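The abstract does not say how retrievability bias is quantified. As a rough illustration only, the sketch below assumes a measure commonly used in retrievability studies: each document is scored by how many queries return it within a rank cutoff, and the inequality of those scores is summarised with a Gini coefficient (0 means every document is equally retrievable; values near 1 mean a few documents dominate). The function names, the `rankings` structure and the default cutoff are hypothetical and are not taken from the paper.

```python
from collections import Counter


def retrievability(rankings, cutoff):
    """Count how often each document appears in the top `cutoff` results.

    `rankings` maps a query string to its ranked list of document ids
    (a hypothetical structure chosen for this sketch).
    """
    counts = Counter()
    for ranked_docs in rankings.values():
        counts.update(ranked_docs[:cutoff])
    return counts


def gini(scores):
    """Gini coefficient of non-negative scores (0 = perfectly equal)."""
    values = sorted(scores)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Standard formula over values sorted in ascending order.
    weighted = sum((2 * i - n - 1) * v for i, v in enumerate(values, start=1))
    return weighted / (n * total)


def retrievability_bias(rankings, all_doc_ids, cutoff=100):
    """Gini coefficient over the whole collection.

    Documents that are never retrieved count as zero, so the full list of
    document ids is needed, not just the ids seen in the rankings.
    """
    counts = retrievability(rankings, cutoff)
    return gini(counts[doc_id] for doc_id in all_doc_ids)
```

Under these assumptions, repeating the calculation with rankings produced from one-term through five-term queries would give one bias value per query length, which could then be compared across retrieval models and length normalisation settings.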
|Title of host publication||CIKM '15 Proceedings of the 24th ACM International on Conference on Information and Knowledge Management|
|Place of Publication||New York, NY, USA|
|Number of pages||4|
|Publication status||Published - 17 Oct 2015|
- query length