The size of a document archive is an important parameter for resource selection in distributed information retrieval systems. In this paper, we present a method for automatically detecting the size (i.e., the number of documents) of a document archive when the archive itself does not provide this information. We also present a method for detecting incremental changes in archive size, which can be useful for deciding whether a resource description has become obsolete and needs to be regenerated. An experimental evaluation of these methods shows that they provide quite accurate information.
|Title of host publication||Advances in information retrieval : 25th European Conference on IR Research, ECIR 2003, Pisa, Italy, April 14-16, 2003 : proceedings|
|Place of Publication||Berlin, Germany|
|Number of pages||10|
|Publication status||Published - Apr 2003|
|Name||Lecture Notes in Computer Science|
- information retrieval systems
- document archive
Wu, S., Gibb, F., & Crestani, F. (2003). Experiments with document archive size detection. In F. Sebastiani (Ed.), Advances in information retrieval : 25th European Conference on IR Research, ECIR 2003, Pisa, Italy, April 14-16, 2003 : proceedings (Vol. 2633, pp. 294-304). (Lecture Notes in Computer Science). Berlin, Germany: Springer.