Predicting hospital readmission or patient death can save both resources and lives, which makes it an important and challenging task for NLP/IR and machine learning applications in the e-Health domain. While many successful approaches predict such clinical events from categorical and numerical variables, most health records consist of raw-text clinical notes. However, models that exploit the free-form natural language in those notes rarely reach the accuracy that clinicians find acceptable. In addition, despite their success in other domains, deep neural approaches have not yet been convincingly shown to outperform classical bag-of-words models on this task. Using a publicly available dataset of clinical notes, we explored several text classification models for predicting patient readmission or death and established that: 1) our deep neural models outperform bag-of-words models by several percentage points; 2) this gain brings them to the accuracy that clinicians typically regard as practically useful (area under the ROC curve above 0.70); 3) our model based on averaging n-gram embeddings performs best, surpassing more specialized models such as recurrent, convolutional, and attention-based transformer models; 4) the modifications we propose to the attention-based transformer model to overcome its input size limit are crucial for achieving top performance.
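To illustrate the idea behind an averaged n-gram embedding model and the ROC AUC metric mentioned above, here is a minimal sketch in Python. It is not the authors' implementation: the hashing scheme, the constants `NUM_BUCKETS` and `EMBED_DIM`, and all function names are hypothetical, and the embedding table is random rather than trained. The point it shows is that averaging yields a fixed-size vector however long the clinical note is, so no input size limit arises.

```python
import random
import zlib

NUM_BUCKETS = 1000  # hypothetical hash-table size (assumption)
EMBED_DIM = 8       # hypothetical embedding width (assumption)

def extract_ngrams(tokens, max_n=2):
    """Return all word n-grams of length 1..max_n, shortest first."""
    grams = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams

def bucket(ngram):
    """Map an n-gram to a table row with a deterministic hash."""
    return zlib.crc32(ngram.encode("utf-8")) % NUM_BUCKETS

def embed_note(text, table):
    """Average the embeddings of every n-gram in a note."""
    grams = extract_ngrams(text.lower().split())
    dim = len(table[0])
    if not grams:
        return [0.0] * dim
    total = [0.0] * dim
    for g in grams:
        row = table[bucket(g)]
        for j in range(dim):
            total[j] += row[j]
    return [v / len(grams) for v in total]

def roc_auc(labels, scores):
    """AUC as the chance a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Random, untrained stand-in for a learned embedding table.
rng = random.Random(0)
table = [[rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]
         for _ in range(NUM_BUCKETS)]

short_vec = embed_note("patient stable", table)
long_vec = embed_note("patient admitted with chest pain " * 200, table)
```

In a real system the table would be learned jointly with a classifier on top of the averaged vector, and the resulting scores would be evaluated with `roc_auc` against the 0.70 threshold cited above.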
UK Healthcare Text Analytics Conference 2021
17/06/21 → 18/06/21
- clinical events
- raw text
- bag of words
- attention-based transformers