Query length, retrievability bias and performance

Colin Wilkie, Leif Azzopardi

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution book

1 Citation (Scopus)

Abstract

Past work has shown that longer queries tend to lead to better retrieval performance. However, this comes at the cost of increased user effort and additional system processing. In this paper, we examine whether there are benefits of longer queries beyond performance. We posit that increasing the query length will also lead to a reduction in the retrievability bias. Additionally, we speculate that to minimise retrievability bias as queries become longer, more length normalisation must be applied to account for the increase in the length of documents retrieved. To this end, we perform a retrievability analysis on two TREC collections using three standard retrieval models and various lengths of queries (one to five terms). From this investigation we find that increasing the length of queries reduces the overall retrievability bias but at a decreasing rate. Moreover, once the query length exceeds three terms the bias can begin to increase (and the performance can start to drop). We also observe that more document length normalisation is typically required as query length increases, in order to minimise bias. Finally, we show that there is a strong correlation between performance and retrieval bias. This work raises some interesting questions regarding query length and its effect on performance and bias. Further work will be directed towards examining longer and more verbose queries, including those generated via query expansion methods, to obtain a more comprehensive understanding of the relationship between query length, performance and retrievability bias.
Language: English
Title of host publication: CIKM '15 Proceedings of the 24th ACM International on Conference on Information and Knowledge Management
Place of Publication: New York, NY, USA
Pages: 1787-1790
Number of pages: 4
DOIs: https://doi.org/10.1145/2806416.2806604
Publication status: Published - 17 Oct 2015
Externally published: Yes

Fingerprint

trend
performance
normalization

Keywords

  • evaluation
  • retrievability
  • query length

Cite this

Wilkie, C., & Azzopardi, L. (2015). Query length, retrievability bias and performance. In CIKM '15 Proceedings of the 24th ACM International on Conference on Information and Knowledge Management (pp. 1787-1790). New York, NY, USA. https://doi.org/10.1145/2806416.2806604
Wilkie, Colin ; Azzopardi, Leif. / Query length, retrievability bias and performance. CIKM '15 Proceedings of the 24th ACM International on Conference on Information and Knowledge Management. New York, NY, USA, 2015. pp. 1787-1790
@inproceedings{249714fec1424c88a3b6719517c2f9dc,
title = "Query length, retrievability bias and performance",
abstract = "Past work has shown that longer queries tend to lead to better retrieval performance. However, this comes at the cost of increased user effort and additional system processing. In this paper, we examine whether there are benefits of longer queries beyond performance. We posit that increasing the query length will also lead to a reduction in the retrievability bias. Additionally, we speculate that to minimise retrievability bias as queries become longer, more length normalisation must be applied to account for the increase in the length of documents retrieved. To this end, we perform a retrievability analysis on two TREC collections using three standard retrieval models and various lengths of queries (one to five terms). From this investigation we find that increasing the length of queries reduces the overall retrievability bias but at a decreasing rate. Moreover, once the query length exceeds three terms the bias can begin to increase (and the performance can start to drop). We also observe that more document length normalisation is typically required as query length increases, in order to minimise bias. Finally, we show that there is a strong correlation between performance and retrieval bias. This work raises some interesting questions regarding query length and its effect on performance and bias. Further work will be directed towards examining longer and more verbose queries, including those generated via query expansion methods, to obtain a more comprehensive understanding of the relationship between query length, performance and retrievability bias.",
keywords = "evaluation, retrievability, query length",
author = "Colin Wilkie and Leif Azzopardi",
year = "2015",
month = "10",
day = "17",
doi = "10.1145/2806416.2806604",
language = "English",
isbn = "978-1-4503-3794-6",
pages = "1787--1790",
booktitle = "CIKM '15 Proceedings of the 24th ACM International on Conference on Information and Knowledge Management",

}

Wilkie, C & Azzopardi, L 2015, Query length, retrievability bias and performance. in CIKM '15 Proceedings of the 24th ACM International on Conference on Information and Knowledge Management. New York, NY, USA, pp. 1787-1790. https://doi.org/10.1145/2806416.2806604

Query length, retrievability bias and performance. / Wilkie, Colin; Azzopardi, Leif.

CIKM '15 Proceedings of the 24th ACM International on Conference on Information and Knowledge Management. New York, NY, USA, 2015. p. 1787-1790.

TY - GEN

T1 - Query length, retrievability bias and performance

AU - Wilkie, Colin

AU - Azzopardi, Leif

PY - 2015/10/17

Y1 - 2015/10/17

N2 - Past work has shown that longer queries tend to lead to better retrieval performance. However, this comes at the cost of increased user effort and additional system processing. In this paper, we examine whether there are benefits of longer queries beyond performance. We posit that increasing the query length will also lead to a reduction in the retrievability bias. Additionally, we speculate that to minimise retrievability bias as queries become longer, more length normalisation must be applied to account for the increase in the length of documents retrieved. To this end, we perform a retrievability analysis on two TREC collections using three standard retrieval models and various lengths of queries (one to five terms). From this investigation we find that increasing the length of queries reduces the overall retrievability bias but at a decreasing rate. Moreover, once the query length exceeds three terms the bias can begin to increase (and the performance can start to drop). We also observe that more document length normalisation is typically required as query length increases, in order to minimise bias. Finally, we show that there is a strong correlation between performance and retrieval bias. This work raises some interesting questions regarding query length and its effect on performance and bias. Further work will be directed towards examining longer and more verbose queries, including those generated via query expansion methods, to obtain a more comprehensive understanding of the relationship between query length, performance and retrievability bias.

AB - Past work has shown that longer queries tend to lead to better retrieval performance. However, this comes at the cost of increased user effort and additional system processing. In this paper, we examine whether there are benefits of longer queries beyond performance. We posit that increasing the query length will also lead to a reduction in the retrievability bias. Additionally, we speculate that to minimise retrievability bias as queries become longer, more length normalisation must be applied to account for the increase in the length of documents retrieved. To this end, we perform a retrievability analysis on two TREC collections using three standard retrieval models and various lengths of queries (one to five terms). From this investigation we find that increasing the length of queries reduces the overall retrievability bias but at a decreasing rate. Moreover, once the query length exceeds three terms the bias can begin to increase (and the performance can start to drop). We also observe that more document length normalisation is typically required as query length increases, in order to minimise bias. Finally, we show that there is a strong correlation between performance and retrieval bias. This work raises some interesting questions regarding query length and its effect on performance and bias. Further work will be directed towards examining longer and more verbose queries, including those generated via query expansion methods, to obtain a more comprehensive understanding of the relationship between query length, performance and retrievability bias.

KW - evaluation

KW - retrievability

KW - query length

UR - http://dl.acm.org/citation.cfm?id=2806604&CFID=841144603&CFTOKEN=57778393

U2 - 10.1145/2806416.2806604

DO - 10.1145/2806416.2806604

M3 - Conference contribution book

SN - 978-1-4503-3794-6

SP - 1787

EP - 1790

BT - CIKM '15 Proceedings of the 24th ACM International on Conference on Information and Knowledge Management

CY - New York, NY, USA

ER -

Wilkie C, Azzopardi L. Query length, retrievability bias and performance. In CIKM '15 Proceedings of the 24th ACM International on Conference on Information and Knowledge Management. New York, NY, USA. 2015. p. 1787-1790 https://doi.org/10.1145/2806416.2806604