The accessibility dimension for structured document retrieval

Thomas Roelleke, Mounia Lalmas, Gabriella Kazai, Ian Ruthven, Stefan Quicker, F. Crestani (Editor), M. Dunlop (Editor), S. Mizzaro (Editor)

Research output: Chapter in Book/Report/Conference proceeding (Chapter)

13 Citations (Scopus)

Abstract

Structured document retrieval aims at retrieving the document components that best satisfy a query, instead of merely retrieving pre-defined document units. This paper reports on an investigation of a tf-idf-acc approach, where tf and idf are the classical term frequency and inverse document frequency, and acc is a new parameter, called accessibility, that captures the structure of documents. The tf-idf-acc approach is defined using a probabilistic relational algebra. To investigate the retrieval quality and estimate the acc values, we developed a method that automatically constructs diverse test collections of structured documents from a standard test collection, with which experiments were carried out. The analysis of the experiments provides estimates of the acc values.
Language: English
Title of host publication: Advances in Information Retrieval
Place of Publication: Germany
Publisher: Springer
Pages: 284-302
Number of pages: 18
Volume: 2291
ISBN (Print): 978-3-540-43343-9
DOIs: https://doi.org/10.1007/3-540-45886-7
Publication status: Published - 25 Mar 2002

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer

Keywords

  • structured document retrieval
  • probabilistic relational algebra
  • accessibility dimension

Cite this

Roelleke, T., Lalmas, M., Kazai, G., Ruthven, I., Quicker, S., Crestani, F. (Ed.), ... Mizzaro, S. (Ed.) (2002). The accessibility dimension for structured document retrieval. In Advances in Information Retrieval (Vol. 2291, pp. 284-302). (Lecture Notes in Computer Science). Germany: Springer. https://doi.org/10.1007/3-540-45886-7
Roelleke, Thomas ; Lalmas, Mounia ; Kazai, Gabriella ; Ruthven, Ian ; Quicker, Stefan ; Crestani, F. (Editor) ; Dunlop, M. (Editor) ; Mizzaro, S. (Editor). / The accessibility dimension for structured document retrieval. Advances in Information Retrieval. Vol. 2291 Germany : Springer, 2002. pp. 284-302 (Lecture Notes in Computer Science).
@incollection{c28796aff7374eeea555d7127b2eb906,
title = "The accessibility dimension for structured document retrieval",
abstract = "Structured document retrieval aims at retrieving the document components that best satisfy a query, instead of merely retrieving pre-defined document units. This paper reports on an investigation of a tf-idf-acc approach, where tf and idf are the classical term frequency and inverse document frequency, and acc is a new parameter, called accessibility, that captures the structure of documents. The tf-idf-acc approach is defined using a probabilistic relational algebra. To investigate the retrieval quality and estimate the acc values, we developed a method that automatically constructs diverse test collections of structured documents from a standard test collection, with which experiments were carried out. The analysis of the experiments provides estimates of the acc values.",
keywords = "structured document retrieval, probabilistic relational algebra, accessibility dimension",
author = "Thomas Roelleke and Mounia Lalmas and Gabriella Kazai and Ian Ruthven and Stefan Quicker",
editor = "F. Crestani and M. Dunlop and S. Mizzaro",
year = "2002",
month = "3",
day = "25",
doi = "10.1007/3-540-45886-7",
language = "English",
isbn = "978-3-540-43343-9",
volume = "2291",
series = "Lecture Notes in Computer Science",
publisher = "Springer",
pages = "284--302",
booktitle = "Advances in Information Retrieval",

}

Roelleke, T, Lalmas, M, Kazai, G, Ruthven, I, Quicker, S, Crestani, F (ed.), Dunlop, M (ed.) & Mizzaro, S (ed.) 2002, The accessibility dimension for structured document retrieval. in Advances in Information Retrieval. vol. 2291, Lecture Notes in Computer Science, Springer, Germany, pp. 284-302. https://doi.org/10.1007/3-540-45886-7

TY  - CHAP
T1  - The accessibility dimension for structured document retrieval
AU  - Roelleke, Thomas
AU  - Lalmas, Mounia
AU  - Kazai, Gabriella
AU  - Ruthven, Ian
AU  - Quicker, Stefan
A2  - Crestani, F.
A2  - Dunlop, M.
A2  - Mizzaro, S.
PY  - 2002/3/25
Y1  - 2002/3/25
N2  - Structured document retrieval aims at retrieving the document components that best satisfy a query, instead of merely retrieving pre-defined document units. This paper reports on an investigation of a tf-idf-acc approach, where tf and idf are the classical term frequency and inverse document frequency, and acc is a new parameter, called accessibility, that captures the structure of documents. The tf-idf-acc approach is defined using a probabilistic relational algebra. To investigate the retrieval quality and estimate the acc values, we developed a method that automatically constructs diverse test collections of structured documents from a standard test collection, with which experiments were carried out. The analysis of the experiments provides estimates of the acc values.
AB  - Structured document retrieval aims at retrieving the document components that best satisfy a query, instead of merely retrieving pre-defined document units. This paper reports on an investigation of a tf-idf-acc approach, where tf and idf are the classical term frequency and inverse document frequency, and acc is a new parameter, called accessibility, that captures the structure of documents. The tf-idf-acc approach is defined using a probabilistic relational algebra. To investigate the retrieval quality and estimate the acc values, we developed a method that automatically constructs diverse test collections of structured documents from a standard test collection, with which experiments were carried out. The analysis of the experiments provides estimates of the acc values.
KW  - structured document retrieval
KW  - probabilistic relational algebra
KW  - accessibility dimension
U2  - 10.1007/3-540-45886-7
DO  - 10.1007/3-540-45886-7
M3  - Chapter
SN  - 978-3-540-43343-9
VL  - 2291
T3  - Lecture Notes in Computer Science
SP  - 284
EP  - 302
BT  - Advances in Information Retrieval
PB  - Springer
CY  - Germany
ER  - 

Roelleke T, Lalmas M, Kazai G, Ruthven I, Quicker S, Crestani F, (ed.) et al. The accessibility dimension for structured document retrieval. In Advances in Information Retrieval. Vol. 2291. Germany: Springer. 2002. p. 284-302. (Lecture Notes in Computer Science). https://doi.org/10.1007/3-540-45886-7