Untangling result list refinement and ranking quality: a framework for evaluation and prediction

Jiyin He, Marc Bron, Arjen de Vries, Leif Azzopardi, Maarten de Rijke

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution book

2 Citations (Scopus)


Traditional batch evaluation metrics assume that user interaction with search results is limited to scanning down a ranked list. However, modern search interfaces come with additional elements supporting result list refinement (RLR) through facets and filters, making user search behavior increasingly dynamic. We develop an evaluation framework that takes a step beyond the interaction assumption of traditional evaluation metrics and allows for batch evaluation of systems with and without RLR elements. In our framework we model user interaction as switching between different sublists. This provides a measure of user effort based on the joint effect of user interaction with RLR elements and result quality. We validate our framework by conducting a user study and comparing model predictions with real user performance. Our model predictions show a significant positive correlation with real user effort. Further, in contrast to traditional evaluation metrics, our framework's predictions of when users stand to benefit from RLR elements reflect the findings of our user study.

Finally, we use the framework to investigate under what conditions systems with and without RLR elements are likely to be effective. We simulate varying conditions concerning ranking quality, users, tasks, and interface properties, demonstrating a cost-effective way to study whole-system performance.
Original language: English
Title of host publication: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval
Place of Publication: New York, NY, USA
Number of pages: 10
Publication status: Published - 9 Aug 2015
Externally published: Yes


  • evaluation
  • search behavior
  • faceted search

