Adaptive query-based sampling of distributed collections

Mark Baillie, Leif Azzopardi, Fabio Crestani

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution book

16 Citations (Scopus)


As part of a Distributed Information Retrieval system, a description
of each remote information resource, archive or repository is
usually stored centrally in order to facilitate resource selection. The acquisition
of precise resource descriptions is therefore an important phase
in Distributed Information Retrieval, as the quality of such representations
affects selection accuracy and, ultimately, retrieval performance.
While Query-Based Sampling is currently used for content
discovery of uncooperative resources, the application of this technique
depends on heuristic guidelines to determine when a sufficiently accurate
representation of each remote resource has been obtained. In this
paper we address this shortcoming by using the Predictive Likelihood to
indicate both the quality of an acquired resource description
estimate and the point at which a sufficiently good representation of a resource
has been obtained during Query-Based Sampling.
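
The stopping idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the resource description is a unigram term-count model with additive smoothing, that the Predictive Likelihood is approximated by the average log-probability of a held-out list of terms, and that sampling stops once that score changes by less than a tolerance between rounds. The names `predictive_log_likelihood`, `adaptive_sample`, `mu`, and `tol`, and the use of pre-fetched document batches in place of live probe queries, are all hypothetical.

```python
import math
from collections import Counter


def predictive_log_likelihood(term_counts, held_out_terms, vocab_size, mu=1.0):
    """Average log-probability of held-out terms under an additively
    smoothed unigram model built from term_counts (assumed formulation)."""
    total = sum(term_counts.values())
    denom = total + mu * vocab_size
    log_probs = [math.log((term_counts[t] + mu) / denom) for t in held_out_terms]
    return sum(log_probs) / len(log_probs)


def adaptive_sample(batches, held_out_terms, vocab_size, tol=0.01):
    """Consume batches of sampled documents until the predictive
    log-likelihood stabilises; return the counts and rounds used.

    Each batch stands in for the documents returned by one round of
    probe queries against a remote resource (hypothetical setup).
    """
    counts = Counter()
    previous = None
    rounds = 0
    for batch in batches:
        rounds += 1
        for doc in batch:
            counts.update(doc.split())
        current = predictive_log_likelihood(counts, held_out_terms, vocab_size)
        # Stop when another round barely changes the score.
        if previous is not None and abs(current - previous) < tol:
            break
        previous = current
    return counts, rounds
```

The design choice here is to replace a fixed document budget with a convergence test, so a resource whose estimated description settles quickly is sampled less than one that keeps changing.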
Original language: English
Title of host publication: Proceedings of the 13th International Conference on String Processing and Information Retrieval
Place of Publication: Berlin, Heidelberg
Number of pages: 13
ISBN (Print): 3-540-45774-7, 978-3-540-45774-9
Publication status: Published - 2006


  • Distributed Information Retrieval System
  • Query-Based Sampling
  • Predictive Likelihood

