Abstract
Part 1 of this series described the lexical isolation and categorization of the text tokens of statements describing generic structures in the texts of Documentation Abstracts from Derwent Publications Ltd.;¹ this paper describes the syntactic and semantic processing of the tokens with a view to producing the corresponding GENSAL expressions. The syntactic analysis proceeds as an expectation-driven process; the result of the analysis is then validated by semantic information associated with each token. The prototype system can satisfactorily process 86% of the 545 descriptions studied. Routines for processing variable expressions, multiplier expressions, nested parameter expressions, nested substitutions, compound token declarations, and conditional expressions are described. Messages calling for manual intervention are produced automatically for the 14% of statements that are beyond the scope of the prototype system. The prototype could either be implemented for retrospective conversion of databases of generic chemical structures from printed sources or be adapted to serve as an intelligent editor during the preparation of patent abstracts.
Original language | English |
---|---|
Pages (from-to) | 468-473 |
Number of pages | 6 |
Journal | Journal of Chemical Information and Computer Sciences |
Volume | 32 |
Issue number | 5 |
Publication status | Published - 1 Sept 1992 |
Keywords
- automatic interpretation
- texts
- chemical patent abstracts
- semantic processing
- natural language processing
- patent abstracts