Abstract
To enhance Knowledge Reuse in the field of space mission design, the implementation of Information Retrieval (IR) is key. Topic Modeling (TM) is used to identify, learn and extract topics from a corpus of documents, and can therefore support several IR tasks such as categorisation. This study relies on a common TM method, Latent Dirichlet Allocation (LDA), a probability-based approach. An extensive Wikipedia-based corpus focused on space mission design is collected, parsed, preprocessed, and used to train a general ‘Space Mission Design’ LDA model. The LDA model is optimised based on the perplexity measure over a range of topic numbers. The topic dictionaries of the retained model are labelled by human annotators, with labels corresponding to spacecraft subsystems. The performance of the general model is evaluated on a categorisation task against a set of space mission requirements. The general model is then used as a base to generate specific LDA models, each focused on one topic, or spacecraft subsystem. The general LDA model developed in this study proves to be a solid base for the generation of focused LDA models, yielding very high accuracy and Mean Reciprocal Rank scores. Finally, a semi-supervised LDA model fed with lexical priors is trained, leading to improved performance of the general model.
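The two evaluation measures named in the abstract, accuracy and Mean Reciprocal Rank, can be illustrated with a minimal sketch. It assumes that, for each requirement, a model returns subsystem labels ordered from most to least probable. The function names, the labels, and the sample rankings below are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def reciprocal_rank(ranked_labels: Sequence[str], true_label: str) -> float:
    """Return 1/rank of the true label (ranks start at 1), or 0.0 if it is absent."""
    for rank, label in enumerate(ranked_labels, start=1):
        if label == true_label:
            return 1.0 / rank
    return 0.0


def evaluate(rankings: Sequence[Sequence[str]],
             truths: Sequence[str]) -> Tuple[float, float]:
    """Return (accuracy, mean reciprocal rank) over a set of categorised items."""
    if not truths or len(rankings) != len(truths):
        raise ValueError("rankings and truths must be non-empty and equally long")
    scores = [reciprocal_rank(r, t) for r, t in zip(rankings, truths)]
    # An item counts as correct only when the true label is ranked first.
    accuracy = sum(1 for s in scores if s == 1.0) / len(scores)
    mrr = sum(scores) / len(scores)
    return accuracy, mrr


# Hypothetical ranked outputs for three requirements (illustrative data only).
rankings = [
    ["thermal", "power", "propulsion"],   # true label ranked 1st
    ["power", "propulsion", "thermal"],   # true label ranked 2nd
    ["propulsion", "thermal", "power"],   # true label ranked 3rd
]
truths = ["thermal", "propulsion", "power"]

accuracy, mrr = evaluate(rankings, truths)
print(f"accuracy={accuracy:.3f}, MRR={mrr:.3f}")
```

Accuracy rewards only a correct top-ranked label, whereas Mean Reciprocal Rank also gives partial credit when the correct subsystem appears lower in the ranking.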
Original language | English |
---|---|
Number of pages | 11 |
Publication status | Published - 25 Oct 2019 |
Event | 70th International Astronautical Congress, Washington D.C., United States. Duration: 21 Oct 2019 → 25 Oct 2019. https://www.iac2019.org/ |
Conference
Conference | 70th International Astronautical Congress |
---|---|
Abbreviated title | IAC |
Country/Territory | United States |
City | Washington D.C. |
Period | 21/10/19 → 25/10/19 |
Internet address | https://www.iac2019.org/ |
Keywords
- topic modeling
- LDA
- machine learning
- categorisation
- mission requirements
- visual assistant