Abstract
The transformer architecture and transfer learning have radically modified the Natural Language Processing (NLP) landscape, enabling new applications in fields where open-source labelled datasets are scarce. Space systems engineering is a field with limited access to large labelled corpora and a need for enhanced knowledge reuse of accumulated design data. Transformer models such as the Bidirectional Encoder Representations from Transformers (BERT) and the Robustly Optimised BERT Pretraining Approach (RoBERTa) are, however, trained on general corpora. To answer the need for domain-specific contextualised word embeddings in the space field, we propose Space Transformers, a novel family of three models, SpaceBERT, SpaceRoBERTa and SpaceSciBERT, further pre-trained from BERT, RoBERTa and SciBERT, respectively, on our domain-specific corpus. We collect and label a new dataset of space systems concepts based on space standards. We fine-tune and compare our domain-specific models to their general counterparts on a domain-specific Concept Recognition (CR) task. Our study demonstrates that the models further pre-trained on a space corpus outperform their respective baseline models in the Concept Recognition task, with SpaceRoBERTa achieving a significantly higher ranking overall.
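As a hedged illustration of how such a domain-adapted model is typically applied, the sketch below frames Concept Recognition as token classification with the Hugging Face Transformers library. The model identifier `icelab/spacebert` and the concept label set are illustrative assumptions, not details confirmed by the paper, and the classification head would still need fine-tuning on the labelled space-standards dataset before the predictions are meaningful.

```python
# Minimal sketch: Concept Recognition as token classification with a
# further pre-trained space-domain transformer. Model ID and label set
# are assumptions for illustration only.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

model_name = "icelab/spacebert"  # hypothetical hub ID for SpaceBERT
labels = ["O", "B-SystemConcept", "I-SystemConcept"]  # illustrative tag set

tokenizer = AutoTokenizer.from_pretrained(model_name)
# The token-classification head is newly initialised here; it must be
# fine-tuned on the labelled Concept Recognition corpus before use.
model = AutoModelForTokenClassification.from_pretrained(
    model_name, num_labels=len(labels)
)

# Tokenise a sentence and predict one concept tag per sub-word token.
sentence = "The thermal control subsystem shall maintain the battery temperature."
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
predictions = logits.argmax(dim=-1).squeeze().tolist()

for token, tag_id in zip(
    tokenizer.convert_ids_to_tokens(inputs["input_ids"][0]), predictions
):
    print(f"{token:15s} {labels[tag_id]}")
```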
Original language | English |
---|---|
Pages (from-to) | 133111-133122 |
Number of pages | 12 |
Journal | IEEE Access |
Volume | 9 |
DOIs | |
Publication status | Published - 24 Sept 2021 |
Keywords
- language model
- transformers
- space systems
- concept recognition
- requirements
Projects
- ESA NPI: Design Engineering Assistant
  Riccardi, A. (Principal Investigator) & Berquand, A. (Researcher)
  1/01/18 → 30/06/21
  Project: Research - Studentship
Datasets
- Data for "SpaceTransformers: language modeling for space systems"
  Berquand, A. (Creator), Darm, P. (Contributor) & Riccardi, A. (Contributor), University of Strathclyde, 13 Dec 2021
  DOI: 10.15129/3c19e737-9054-4892-8ee5-4c4c7f406410
  Dataset
- Dataset of space systems corpora (Thesis data)
  Berquand, A. (Creator) & Riccardi, A. (Contributor), University of Strathclyde, 8 Dec 2021
  DOI: 10.15129/8e1c3353-ccbe-4835-b4f9-bffd6b5e058b
  Dataset