CLEF 2018 technologically assisted reviews in empirical medicine overview

Evangelos Kanoulas, Dan Li, Leif Azzopardi, Rene Spijker

Research output: Contribution to journal › Conference article

8 Citations (Scopus)
6 Downloads (Pure)

Abstract

Conducting a systematic review is a widely used method for obtaining an overview of the current scientific consensus on a topic of interest by bringing together multiple studies in a reliable, transparent way. The large and growing number of published studies, and their increasing rate of publication, make the task of identifying all relevant studies in an unbiased way both complex and time-consuming, to the extent that it jeopardizes the validity of a review's findings and its ability to inform policy and practice in a timely manner. The CLEF 2018 e-Health Technology Assisted Reviews in Empirical Medicine task aims to evaluate search algorithms that seek to identify all studies relevant to conducting a systematic review in empirical medicine. The task focused on Diagnostic Test Accuracy (DTA) reviews and consisted of two subtasks: 1) given a set of relevance criteria, as described in a systematic review protocol, search a large medical database of article abstracts (PubMed) to find the studies to be included in the review; and 2) given the article abstracts retrieved by a carefully designed Boolean query, prioritize them to reduce the effort required by experts to screen the abstracts for inclusion in the review. Seven teams participated in the task, submitting a total of 12 runs for subtask 1 and 19 runs for subtask 2. This paper reports both the methodology used to construct the benchmark collection and the results of the evaluation.
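Subtask 2 centres on prioritizing abstracts so that reviewers encounter the relevant studies as early as possible during screening. As a point of reference, the sketch below shows a minimal relevance-feedback loop of the kind many high-recall systems build on: a TF-IDF logistic-regression classifier is retrained after each batch of simulated reviewer judgments, and the remaining abstracts are re-ranked. This is an illustrative assumption about the general approach, not the task's official pipeline or any participating team's system; all function and variable names here are hypothetical.

```python
# Minimal sketch of a relevance-feedback screening loop (illustrative only;
# not the CLEF 2018 task's official pipeline or any team's actual system).
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def screen(abstracts, labels, seed_ids, batch_size=25):
    """Iteratively rank abstracts, 'screen' the top batch, and retrain.

    abstracts : list[str] -- candidate abstracts retrieved by the Boolean query
    labels    : list[int] -- 1 = included in review, 0 = excluded
                             (used here only to simulate reviewer feedback)
    seed_ids  : list[int] -- indices of a few initially judged abstracts
    Returns the full screening order as a list of indices.
    """
    X = TfidfVectorizer(stop_words="english").fit_transform(abstracts)
    judged = list(seed_ids)
    order = list(seed_ids)
    remaining = [i for i in range(len(abstracts)) if i not in judged]

    while remaining:
        y = [labels[i] for i in judged]
        if len(set(y)) > 1:  # need both classes before the classifier can train
            clf = LogisticRegression(max_iter=1000).fit(X[judged], y)
            scores = clf.predict_proba(X[remaining])[:, 1]
        else:                # no relevant (or no irrelevant) examples yet
            scores = np.random.rand(len(remaining))
        # screen the highest-scoring batch next, then retrain on its labels
        ranked = [remaining[j] for j in np.argsort(-scores)]
        batch = ranked[:batch_size]
        order.extend(batch)
        judged.extend(batch)
        remaining = ranked[batch_size:]
    return order

# Example use: order = screen(abstracts, labels, seed_ids=[0, 1, 2]).
# High-recall evaluation then asks how quickly recall grows as a
# function of the number of abstracts screened, i.e. labels[order[:k]].
```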

Original language: English
Number of pages: 34
Journal: CEUR Workshop Proceedings
Volume: 2125
Publication status: Published - 24 Jul 2018
Event: 19th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2018 - Avignon, France
Duration: 10 Sep 2018 – 14 Sep 2018

Keywords

  • active learning
  • benchmarking
  • Cochrane
  • diagnostic test accuracy
  • DTA
  • e-health
  • evaluation
  • high recall
  • information retrieval
  • PubMed
  • relevance feedback
  • systematic reviews
  • TAR
  • technology assisted reviews
  • test collection
  • text classification

Cite this

Kanoulas, E., Li, D., Azzopardi, L., & Spijker, R. (2018). CLEF 2018 technologically assisted reviews in empirical medicine overview. CEUR Workshop Proceedings, 2125 (ISSN 1613-0073). http://ceur-ws.org/Vol-2125/invited_paper_6.pdf
