Building simulated queries for known-item topics: an analysis using six european languages

Leif Azzopardi, Maarten de Rijke, Krisztian Balog

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution book

68 Citations (Scopus)

Abstract

There has been increased interest in the use of simulated queries for evaluation and estimation purposes in Information Retrieval. However, there are still many unaddressed issues regarding their usage and impact on evaluation because their quality, in terms of retrieval performance, is unlike real queries. In this paper, we focus on methods for building simulated known-item topics and explore their quality against real known-item topics. Using existing generation models as our starting point, we explore factors which may influence the generation of the known-item topic. Informed by this detailed analysis (on six European languages) we propose a model with improved document and term selection properties, showing that simulated known-item topics can be generated that are comparable to real known-item topics. This is a significant step towards validating the potential usefulness of simulated queries: for evaluation purposes, and because building models of querying behavior provides a deeper insight into the querying process so that better retrieval mechanisms can be developed to support the user.
Language: English
Title of host publication: SIGIR '07 Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Place of Publication: New York, NY, USA
Pages: 455-462
Number of pages: 8
DOIs: https://doi.org/10.1145/1277741.1277820
Publication status: Published - 23 Jul 2007
Externally published: Yes

Keywords

  • multilingual retrieval
  • query simulation
  • query generation

Cite this

Azzopardi, L., de Rijke, M., & Balog, K. (2007). Building simulated queries for known-item topics: an analysis using six european languages. In SIGIR '07 Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 455-462). New York, NY, USA. https://doi.org/10.1145/1277741.1277820
Azzopardi, Leif ; de Rijke, Maarten ; Balog, Krisztian. / Building simulated queries for known-item topics : an analysis using six european languages. SIGIR '07 Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. New York, NY, USA, 2007. pp. 455-462
@inproceedings{a394d2c0e961429785329f2734eca0d7,
title = "Building simulated queries for known-item topics: an analysis using six european languages",
abstract = "There has been increased interest in the use of simulated queries for evaluation and estimation purposes in Information Retrieval. However, there are still many unaddressed issues regarding their usage and impact on evaluation because their quality, in terms of retrieval performance, is unlike real queries. In this paper, we focus on methods for building simulated known-item topics and explore their quality against real known-item topics. Using existing generation models as our starting point, we explore factors which may influence the generation of the known-item topic. Informed by this detailed analysis (on six European languages) we propose a model with improved document and term selection properties, showing that simulated known-item topics can be generated that are comparable to real known-item topics. This is a significant step towards validating the potential usefulness of simulated queries: for evaluation purposes, and because building models of querying behavior provides a deeper insight into the querying process so that better retrieval mechanisms can be developed to support the user.",
keywords = "multilingual retrieval, query simulation, query generation",
author = "Leif Azzopardi and {de Rijke}, Maarten and Krisztian Balog",
year = "2007",
month = "7",
day = "23",
doi = "10.1145/1277741.1277820",
language = "English",
isbn = "978-1-59593-597-7",
pages = "455--462",
booktitle = "SIGIR '07 Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval",

}

Azzopardi, L, de Rijke, M & Balog, K 2007, Building simulated queries for known-item topics: an analysis using six european languages. in SIGIR '07 Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. New York, NY, USA, pp. 455-462. https://doi.org/10.1145/1277741.1277820

TY - GEN

T1 - Building simulated queries for known-item topics

T2 - an analysis using six european languages

AU - Azzopardi, Leif

AU - de Rijke, Maarten

AU - Balog, Krisztian

PY - 2007/7/23

Y1 - 2007/7/23

N2 - There has been increased interest in the use of simulated queries for evaluation and estimation purposes in Information Retrieval. However, there are still many unaddressed issues regarding their usage and impact on evaluation because their quality, in terms of retrieval performance, is unlike real queries. In this paper, we focus on methods for building simulated known-item topics and explore their quality against real known-item topics. Using existing generation models as our starting point, we explore factors which may influence the generation of the known-item topic. Informed by this detailed analysis (on six European languages) we propose a model with improved document and term selection properties, showing that simulated known-item topics can be generated that are comparable to real known-item topics. This is a significant step towards validating the potential usefulness of simulated queries: for evaluation purposes, and because building models of querying behavior provides a deeper insight into the querying process so that better retrieval mechanisms can be developed to support the user.

AB - There has been increased interest in the use of simulated queries for evaluation and estimation purposes in Information Retrieval. However, there are still many unaddressed issues regarding their usage and impact on evaluation because their quality, in terms of retrieval performance, is unlike real queries. In this paper, we focus on methods for building simulated known-item topics and explore their quality against real known-item topics. Using existing generation models as our starting point, we explore factors which may influence the generation of the known-item topic. Informed by this detailed analysis (on six European languages) we propose a model with improved document and term selection properties, showing that simulated known-item topics can be generated that are comparable to real known-item topics. This is a significant step towards validating the potential usefulness of simulated queries: for evaluation purposes, and because building models of querying behavior provides a deeper insight into the querying process so that better retrieval mechanisms can be developed to support the user.

KW - multilingual retrieval

KW - query simulation

KW - query generation

U2 - 10.1145/1277741.1277820

DO - 10.1145/1277741.1277820

M3 - Conference contribution book

SN - 978-1-59593-597-7

SP - 455

EP - 462

BT - SIGIR '07 Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval

CY - New York, NY, USA

ER -

Azzopardi L, de Rijke M, Balog K. Building simulated queries for known-item topics: an analysis using six european languages. In SIGIR '07 Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. New York, NY, USA. 2007. p. 455-462. https://doi.org/10.1145/1277741.1277820