In distributed information retrieval systems, documents frequently overlap across the results returned by different resources. This is especially true of meta-search engines, which merge results from several web search engines. This paper addresses the problem of merging results by exploiting these overlaps to achieve better performance. New result-merging algorithms are proposed that take advantage of duplicate documents in two ways: one correlates scores across different result lists; the other treats duplicates as increasing evidence of relevance to the given query. Extensive experiments demonstrate that these methods are effective.
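The abstract does not spell out the proposed algorithms, so the following is only an illustrative sketch of the second idea (duplicates as added evidence of relevance), not the authors' method. It uses CombMNZ, a well-known fusion rule that sums a document's normalized scores and multiplies the sum by the number of result lists containing it. The function names, the min-max normalization step, and the sample document ids are assumptions made for this example.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Scale one result list's scores into [0, 1] (assumed normalization)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so every document gets the same weight.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def comb_mnz(result_lists):
    """Fuse result lists with CombMNZ.

    Each element of result_lists maps a document id to the raw score
    assigned by one resource. A document found in several lists is
    boosted: its summed normalized score is multiplied by the number
    of lists that returned it.
    """
    summed = defaultdict(float)
    hits = defaultdict(int)
    for scores in result_lists:
        for doc, s in min_max_normalize(scores).items():
            summed[doc] += s
            hits[doc] += 1
    fused = {doc: summed[doc] * hits[doc] for doc in summed}
    # Highest fused score first; ties broken by document id.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    # Hypothetical results from two search engines with different score scales.
    engine_a = {"d1": 10.0, "d2": 8.0, "d3": 2.0}
    engine_b = {"d2": 9.0, "d4": 5.0, "d1": 1.0}
    for doc, score in comb_mnz([engine_a, engine_b]):
        print(doc, score)
```

In this sketch, documents returned by both engines rise above documents returned by only one, which is the intuition behind treating overlap as evidence. The first idea in the abstract, correlating scores between resources through their shared documents, would need a separate score-mapping step that is not shown here.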
|Title of host publication||Proceedings of the ACM SIGIR 2003 Workshop on Distributed Information Retrieval|
|Number of pages||16|
|Publication status||Published - 2004|
|Name||Lecture Notes in Computer Science|
- information retrieval
- search engines
- search relevance