Updating collection representations for federated search

M. Shokouhi, M. Baillie, L. Azzopardi

Research output: Contribution to conference › Paper

Abstract

To facilitate the search for relevant information across a set of online distributed collections, a federated information retrieval system typically represents each collection, centrally, by a set of vocabularies or sampled documents. Accurate retrieval is therefore related to how precisely each representation reflects the underlying content stored in that collection. As collections evolve over time, collection representations should also be updated to reflect any change; however, no solution has yet been proposed. In this study we examine the implications of out-of-date representation sets for retrieval accuracy and propose three different policies for managing the necessary updates. Each policy is evaluated on a testbed of forty-four dynamic collections over an eight-week period. Our findings show that out-of-date representations significantly degrade performance over time; however, adopting a suitable update policy can minimise this problem.
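As a rough illustration of why a sampled representation goes stale, the sketch below measures how much of a collection's current term mass is still covered by the vocabulary of an older sample. It is not one of the three policies evaluated in the paper, which the abstract does not describe. The function names, the whitespace tokenisation, the use of a fresh sample as a stand-in for the collection's current content, and the 0.8 threshold are all assumptions made for this example.

```python
from collections import Counter


def term_distribution(documents):
    """Relative term frequencies over lower-cased, whitespace-tokenised documents."""
    counts = Counter(term for doc in documents for term in doc.lower().split())
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}


def vocabulary_coverage(representation, fresh_sample):
    """Fraction of the fresh sample's term mass whose terms appear in the
    stored representation's vocabulary (1.0 means fully covered)."""
    known_terms = set(term_distribution(representation))
    fresh = term_distribution(fresh_sample)
    return sum(p for term, p in fresh.items() if term in known_terms)


def needs_update(representation, fresh_sample, threshold=0.8):
    """Hypothetical trigger: flag the representation once coverage drops
    below an assumed threshold."""
    return vocabulary_coverage(representation, fresh_sample) < threshold


# Toy data: the collection has drifted towards a new topic since sampling.
old_sample = ["federated search engines", "search engines rank collections"]
fresh_sample = [
    "federated search engines",
    "election results announced",
    "election coverage continues",
]

print(vocabulary_coverage(old_sample, fresh_sample))
print(needs_update(old_sample, fresh_sample))
```

In this toy case only three of the nine tokens in the fresh sample are known to the old representation, so the trigger fires. A real broker would also have to weigh the cost of re-sampling each collection.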
Original language: English
Number of pages: 8
Publication status: Unpublished - Jul 2007
Event: 30th annual international ACM SIGIR conference on Research and development in information retrieval - Amsterdam, Netherlands
Duration: 23 Jul 2007 - 27 Jul 2007

Conference

Conference: 30th annual international ACM SIGIR conference on Research and development in information retrieval
Country: Netherlands
City: Amsterdam
Period: 23/07/07 - 27/07/07

Fingerprint

Information retrieval systems
Testbeds

Keywords

  • information retrieval systems
  • vocabularies
  • collection representations
  • retrieval accuracy

Cite this

Shokouhi, M., Baillie, M., & Azzopardi, L. (2007). Updating collection representations for federated search. Paper presented at 30th annual international ACM SIGIR conference on Research and development in information retrieval, Amsterdam, Netherlands.
Shokouhi, M. ; Baillie, M. ; Azzopardi, L. / Updating collection representations for federated search. Paper presented at 30th annual international ACM SIGIR conference on Research and development in information retrieval, Amsterdam, Netherlands. 8 p.
@conference{20d4d7a8edc545be96e5c19afdf808ca,
title = "Updating collection representations for federated search",
abstract = "To facilitate the search for relevant information across a set of online distributed collections, a federated information retrieval system typically represents each collection, centrally, by a set of vocabularies or sampled documents. Accurate retrieval is therefore related to how precisely each representation reflects the underlying content stored in that collection. As collections evolve over time, collection representations should also be updated to reflect any change; however, no solution has yet been proposed. In this study we examine the implications of out-of-date representation sets for retrieval accuracy and propose three different policies for managing the necessary updates. Each policy is evaluated on a testbed of forty-four dynamic collections over an eight-week period. Our findings show that out-of-date representations significantly degrade performance over time; however, adopting a suitable update policy can minimise this problem.",
keywords = "information retrieval systems, vocabularies, collection representations, retrieval accuracy",
author = "M. Shokouhi and M. Baillie and L. Azzopardi",
year = "2007",
month = "7",
language = "English",
note = "30th annual international ACM SIGIR conference on Research and development in information retrieval ; Conference date: 23-07-2007 Through 27-07-2007",

}

Shokouhi, M, Baillie, M & Azzopardi, L 2007, 'Updating collection representations for federated search', Paper presented at 30th annual international ACM SIGIR conference on Research and development in information retrieval, Amsterdam, Netherlands, 23/07/07 - 27/07/07.

Updating collection representations for federated search. / Shokouhi, M.; Baillie, M.; Azzopardi, L.

2007. Paper presented at 30th annual international ACM SIGIR conference on Research and development in information retrieval, Amsterdam, Netherlands.

TY - CONF

T1 - Updating collection representations for federated search

AU - Shokouhi, M.

AU - Baillie, M.

AU - Azzopardi, L.

PY - 2007/7

Y1 - 2007/7

N2 - To facilitate the search for relevant information across a set of online distributed collections, a federated information retrieval system typically represents each collection, centrally, by a set of vocabularies or sampled documents. Accurate retrieval is therefore related to how precisely each representation reflects the underlying content stored in that collection. As collections evolve over time, collection representations should also be updated to reflect any change; however, no solution has yet been proposed. In this study we examine the implications of out-of-date representation sets for retrieval accuracy and propose three different policies for managing the necessary updates. Each policy is evaluated on a testbed of forty-four dynamic collections over an eight-week period. Our findings show that out-of-date representations significantly degrade performance over time; however, adopting a suitable update policy can minimise this problem.

KW - information retrieval systems

KW - vocabularies

KW - collection representations

KW - retrieval accuracy

UR - http://www.sigir2007.org/

UR - http://www.cis.strath.ac.uk/research/publications/papers/strath_cis_publication_1975.pdf

UR - http://dl.acm.org/citation.cfm?id=1277741&coll=DL&dl=ACM&CFID=91974121&CFTOKEN=79154954

M3 - Paper

ER -

Shokouhi M, Baillie M, Azzopardi L. Updating collection representations for federated search. 2007. Paper presented at 30th annual international ACM SIGIR conference on Research and development in information retrieval, Amsterdam, Netherlands.