Abstract
Language | English |
---|---|
Title of host publication | Advances in Information Retrieval |
Place of Publication | Germany |
Publisher | Springer |
Pages | 303-323 |
Number of pages | 20 |
Volume | 2291 |
ISBN (Print) | 978-3-540-43343-9 |
DOIs | 10.1007/3-540-45886-7_20 |
Publication status | Published - 27 Mar 2002 |
Publication series
Name | Lecture Notes in Computer Science |
---|---|
Publisher | Springer |
Keywords
- bilingual dictionaries
- web documents
- information retrieval
- searching
Cite this
Building bilingual dictionaries from parallel web documents. / McEwan, C.J.A.; Ounis, I.; Ruthven, I.; Crestani, F. (Editor); Lalmas, M. (Editor).
Advances in Information Retrieval. Vol. 2291. Germany: Springer, 2002. p. 303-323 (Lecture Notes in Computer Science).
Research output: Chapter in Book/Report/Conference proceeding › Chapter
TY - CHAP
T1 - Building bilingual dictionaries from parallel web documents
AU - McEwan, C.J.A.
AU - Ounis, I.
AU - Ruthven, I.
A2 - Crestani, F.
A2 - Lalmas, M.
PY - 2002/3/27
Y1 - 2002/3/27
N2 - In this paper we describe a system for automatically constructing a bilingual dictionary for cross-language information retrieval applications. We describe how we automatically target candidate parallel documents, filter the candidate documents and process them to create parallel sentences. The parallel sentences are then automatically translated using an adaptation of the EMIM technique, and a dictionary of translation terms is created. We evaluate our dictionary using human experts. The evaluation showed that the system performs well. In addition, the results obtained from automatically created corpora are comparable to those obtained from manually created corpora of parallel documents. Compared to other available techniques, our approach has the advantage of being simple, uniform, and easy to implement while providing encouraging results.
AB - In this paper we describe a system for automatically constructing a bilingual dictionary for cross-language information retrieval applications. We describe how we automatically target candidate parallel documents, filter the candidate documents and process them to create parallel sentences. The parallel sentences are then automatically translated using an adaptation of the EMIM technique, and a dictionary of translation terms is created. We evaluate our dictionary using human experts. The evaluation showed that the system performs well. In addition, the results obtained from automatically created corpora are comparable to those obtained from manually created corpora of parallel documents. Compared to other available techniques, our approach has the advantage of being simple, uniform, and easy to implement while providing encouraging results.
KW - bilingual dictionaries
KW - web documents
KW - information retrieval
KW - searching
UR - http://www.cis.strath.ac.uk/research/publications/papers/strath_cis_publication_143.pdf
U2 - 10.1007/3-540-45886-7_20
DO - 10.1007/3-540-45886-7_20
M3 - Chapter
SN - 978-3-540-43343-9
VL - 2291
T3 - Lecture Notes in Computer Science
SP - 303
EP - 323
BT - Advances in Information Retrieval
PB - Springer
CY - Germany
ER -