An initial investigation of query expansion bias

Colin Wilkie, Leif Azzopardi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Query expansion is a useful retrieval mechanism for creating more verbose queries from the user's initial keyword search. Query expansion methods generally have multiple parameters that allow the user to define how many terms are introduced to the expanded query and where those terms come from. However, the idea that query expansion may introduce biases into the system by selecting terms from overly retrievable documents has never been formally evaluated. In this work, the relationship between performance and retrievability bias is explored when various query expansion methods are employed to aid retrieval. Several parameters are altered, independently, to identify those that have an impact on bias. The parameters altered include Rocchio's beta, the length normalisation parameters, the number of terms added and the number of documents those terms are extracted from. The evaluation performed here identifies a strong correlation between performance and retrievability bias, suggesting that performance is increased by making the system more biased and thus more likely to pick terms from a set of overly retrievable documents.
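To make the parameters named in the abstract concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a simplified Rocchio update (original query weight fixed at 1, no negative feedback), a length-normalised term frequency as the term weight, and the Gini coefficient as the bias measure; the abstract does not state which bias measure or weighting was used. The function names `rocchio_expand` and `gini` and their argument names are hypothetical.

```python
from collections import Counter


def rocchio_expand(query_terms, feedback_docs, beta=0.5, num_docs=2, num_terms=2):
    """Return a weighted expanded query as a dict of term -> weight.

    Assumed, simplified Rocchio: original terms start at weight 1.0 and
    feedback terms are weighted by beta times the feedback centroid.
    feedback_docs is a ranked list of token lists; only the top num_docs
    are used, and at most num_terms new terms are added.
    """
    top_docs = feedback_docs[:num_docs]
    centroid = Counter()
    for doc in top_docs:
        if not doc:
            continue  # skip empty documents to avoid dividing by zero
        for term, count in Counter(doc).items():
            # Length-normalised term frequency, averaged over the feedback set.
            centroid[term] += count / len(doc) / len(top_docs)

    expanded = {term: 1.0 for term in query_terms}
    # Candidate expansion terms: not already in the query, heaviest first,
    # ties broken alphabetically so the result is deterministic.
    candidates = sorted(
        (t for t in centroid if t not in expanded),
        key=lambda t: (-centroid[t], t),
    )
    for term in candidates[:num_terms]:
        expanded[term] = beta * centroid[term]
    # Original query terms also receive the feedback contribution.
    for term in query_terms:
        expanded[term] += beta * centroid.get(term, 0.0)
    return expanded


def gini(scores):
    """Gini coefficient of per-document retrievability scores.

    0.0 means every document is equally retrievable; values approaching 1.0
    mean retrievability is concentrated in a few documents (assumed measure).
    """
    values = sorted(scores)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((2 * i - n - 1) * v for i, v in enumerate(values, start=1))
    return weighted / (n * total)


if __name__ == "__main__":
    docs = [["query", "expansion", "bias", "bias"], ["bias", "retrieval"], ["other"]]
    print(rocchio_expand(["query"], docs, beta=0.5, num_docs=2, num_terms=1))
    print(gini([0, 0, 0, 4]))
```

Under these assumptions, raising `beta`, `num_docs`, or `num_terms` shifts more of the query weight onto terms drawn from the top-ranked documents. Recomputing `gini` over retrievability scores for each setting is one way to observe whether a configuration concentrates retrieval on fewer documents.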

Language: English
Title of host publication: ICTIR 2017 - Proceedings of the 2017 ACM SIGIR International Conference on the Theory of Information Retrieval
Place of Publication: New York
Pages: 285-288
Number of pages: 4
DOI: 10.1145/3121050.3121097
State: Published - 1 Oct 2017
Event: 7th ACM SIGIR International Conference on the Theory of Information Retrieval, ICTIR 2017 - Amsterdam, Netherlands
Duration: 1 Oct 2017 to 4 Oct 2017

Conference

Conference: 7th ACM SIGIR International Conference on the Theory of Information Retrieval, ICTIR 2017
Country: Netherlands
City: Amsterdam
Period: 1/10/17 to 4/10/17

Keywords

  • performance
  • retrievability bias
  • information retrieval

Cite this

Wilkie, C., & Azzopardi, L. (2017). An initial investigation of query expansion bias. In ICTIR 2017 - Proceedings of the 2017 ACM SIGIR International Conference on the Theory of Information Retrieval (pp. 285-288). New York. DOI: 10.1145/3121050.3121097
Wilkie, Colin ; Azzopardi, Leif. / An initial investigation of query expansion bias. ICTIR 2017 - Proceedings of the 2017 ACM SIGIR International Conference on the Theory of Information Retrieval. New York, 2017. pp. 285-288
@inproceedings{edf24e98d602447eb06fb5b1d3bd4521,
title = "An initial investigation of query expansion bias",
abstract = "Query expansion is a useful retrieval mechanism for creating more verbose queries from the user's initial keyword search. Query expansion methods generally have multiple parameters that allow the user to define how many terms are introduced to the expanded query and where those terms come from. However, the idea that query expansion may introduce biases into the system by selecting terms from overly retrievable documents has never been formally evaluated. In this work, the relationship between performance and retrievability bias is explored when various query expansion methods are employed to aid retrieval. Several parameters are altered, independently, to identify those that have an impact on bias. The parameters altered include Rocchio's beta, the length normalisation parameters, the number of terms added and the number of documents those terms are extracted from. The evaluation performed here identifies a strong correlation between performance and retrievability bias, suggesting that performance is increased by making the system more biased and thus more likely to pick terms from a set of overly retrievable documents.",
keywords = "performance, retrievability bias, information retrieval",
author = "Colin Wilkie and Leif Azzopardi",
year = "2017",
month = "10",
day = "1",
doi = "10.1145/3121050.3121097",
language = "English",
isbn = "9781450344906",
pages = "285--288",
booktitle = "ICTIR 2017 - Proceedings of the 2017 ACM SIGIR International Conference on the Theory of Information Retrieval",

}

Wilkie, C & Azzopardi, L 2017, An initial investigation of query expansion bias. in ICTIR 2017 - Proceedings of the 2017 ACM SIGIR International Conference on the Theory of Information Retrieval. New York, pp. 285-288, 7th ACM SIGIR International Conference on the Theory of Information Retrieval, ICTIR 2017, Amsterdam, Netherlands, 1/10/17. DOI: 10.1145/3121050.3121097

TY - GEN

T1 - An initial investigation of query expansion bias

AU - Wilkie, Colin

AU - Azzopardi, Leif

PY - 2017/10/1

Y1 - 2017/10/1

AB - Query expansion is a useful retrieval mechanism for creating more verbose queries from the user's initial keyword search. Query expansion methods generally have multiple parameters that allow the user to define how many terms are introduced to the expanded query and where those terms come from. However, the idea that query expansion may introduce biases into the system by selecting terms from overly retrievable documents has never been formally evaluated. In this work, the relationship between performance and retrievability bias is explored when various query expansion methods are employed to aid retrieval. Several parameters are altered, independently, to identify those that have an impact on bias. The parameters altered include Rocchio's beta, the length normalisation parameters, the number of terms added and the number of documents those terms are extracted from. The evaluation performed here identifies a strong correlation between performance and retrievability bias, suggesting that performance is increased by making the system more biased and thus more likely to pick terms from a set of overly retrievable documents.

KW - performance

KW - retrievability bias

KW - information retrieval

UR - http://sigir.org/ictir2017/

UR - http://www.scopus.com/inward/record.url?scp=85033221371&partnerID=8YFLogxK

U2 - 10.1145/3121050.3121097

DO - 10.1145/3121050.3121097

M3 - Conference contribution

SN - 9781450344906

SP - 285

EP - 288

BT - ICTIR 2017 - Proceedings of the 2017 ACM SIGIR International Conference on the Theory of Information Retrieval

CY - New York

ER -

Wilkie C, Azzopardi L. An initial investigation of query expansion bias. In ICTIR 2017 - Proceedings of the 2017 ACM SIGIR International Conference on the Theory of Information Retrieval. New York. 2017. p. 285-288. DOI: 10.1145/3121050.3121097