Acoustic-based assistive technology tools for dysarthria management

  • Tolulope Bamidele Ijitona

Student thesis: Doctoral Thesis


The research presented in this thesis applies digital signal processing algorithms to the detection and treatment of dysarthria, a neurological motor speech disorder. The novel algorithms presented include: a silence, unvoiced and voiced segmentation technique for dysarthric speech based on linear prediction error variance (LPEV); an automatic diadochokinetic (DDK) analysis and segmentation scheme for dysarthric speech; the extraction of prosodic, voice quality, pronunciation and wavelet features for the detection and severity classification of dysarthric speech; and the modification of dysarthric speech features using speech enhancement techniques to improve intelligibility in a stress production exercise for the treatment of dysarthria.

In particular, an improved silence, unvoiced and voiced segmentation technique for dysarthric speech is proposed. It uses a two-layer segmentation approach that combines short-time energy (STE) and LPEV to differentiate clearly between silence and voiced segments despite the reduced or inconsistent intensity, pauses, voice breaks and slow speech rate found in dysarthric speech. Including the LPEV in the segmentation process proved advantageous in eliminating segmentation errors caused by the similarity between the STE profiles of silence and voiced segments in dysarthric speech. The experimental results show that this segmentation method is also effective and efficient in reducing the effects of artefacts introduced in dysarthric speech.

A novel automatic DDK analysis scheme is proposed to extract individual DDK syllables and analyse them for consistency. The method is based on a speaker-specific moving average threshold (rather than a fixed threshold), which addresses the varying intensities in the DDK sounds produced by speakers with dysarthria. It also addresses the challenge of intra-syllable breaks in dysarthric DDK syllables using a minimum distance merging approach. In addition, the algorithm analyses the segmented DDK syllables by calculating the individual DDK rates and their variance in order to measure the consistency of DDK syllable production. The high accuracy of the proposed method is verified using both dysarthric and healthy control databases.

Three novel schemes for automatic detection and severity classification of dysarthric speech are also proposed. The first extracts an extended speech feature called the centroid formant (a representation of energy concentration in the frequency spectrum) and classifies these centroid formants using neural network classifiers for the detection of dysarthria. The centroid formant-based detection scheme also forms the backbone of the second, more robust detection scheme, which combines centroid formants with prosodic, voice quality, pronunciation and wavelet features for more efficient classification. A third scheme is developed specifically for the classification of dysarthria into three severity levels using the same features as the second scheme. These detection and severity classification schemes are evaluated by calculating the accuracy, sensitivity and specificity of the classifiers.

The research also investigates how modifying the prosodic cues used in stress production affects the ability of listeners to correctly identify the position of the stressed word in a sentence. The investigation focuses on the three prosodic cues used by healthy control speakers in stress production, namely intensity, duration and fundamental frequency. These three features are modified acoustically and presented to untrained listeners in order to evaluate the effects of the individual and combined modifications on the listeners' perception. The findings will help clinicians, including speech and language therapists, make an informed decision on which prosodic feature to focus on during stress production exercises for the management of dysarthria.

Finally, the dysarthria management schemes proposed in this research are developed into user-interactive tools in MATLAB, from which speaker-specific information and reports can be generated and downloaded for progress monitoring and further analysis.
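
As a rough illustration of the DDK analysis idea described in the abstract (an adaptive moving average threshold, merging of intra-syllable breaks by minimum distance, then per-syllable rates and their variance), the following is a minimal sketch only. It is not the thesis implementation: it is written in Python rather than MATLAB, and the function names, the centred averaging window, the onset-to-onset definition of rate and all parameter values are assumptions chosen for illustration.

```python
from statistics import pvariance


def moving_average(x, win):
    """Centred moving average; the window is truncated at the signal edges."""
    half = win // 2
    out = []
    for i in range(len(x)):
        lo = max(0, i - half)
        hi = min(len(x), i + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


def segment_syllables(energy, win=21, scale=1.0, min_gap=3):
    """Return (start, end) frame pairs (end exclusive) for candidate syllables.

    A frame is active when its energy exceeds a threshold derived from the
    speaker's own local average energy (hypothetical `scale` factor).
    Segments separated by fewer than `min_gap` frames are merged, which
    treats short intra-syllable breaks as part of one syllable.
    """
    thresh = moving_average(energy, win)
    active = [e > scale * t for e, t in zip(energy, thresh)]

    segments, start = [], None
    for i, is_active in enumerate(active):
        if is_active and start is None:
            start = i
        elif not is_active and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(active)))

    merged = []
    for seg in segments:
        if merged and seg[0] - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], seg[1])  # bridge the short gap
        else:
            merged.append(seg)
    return merged


def ddk_rates(segments, frame_rate):
    """Per-syllable rates (syllables/s) from onset-to-onset intervals,
    plus their population variance as a simple consistency measure."""
    onsets = [start for start, _ in segments]
    rates = [frame_rate / (b - a) for a, b in zip(onsets, onsets[1:])]
    return rates, (pvariance(rates) if rates else 0.0)
```

Here `energy` is assumed to be a per-frame short-time energy sequence and `frame_rate` the number of frames per second. A lower variance of the returned rates indicates more consistent syllable production under this simplified measure.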
Date of Award: 20 Apr 2020
Original language: English
Awarding Institution
  • University Of Strathclyde
Sponsors: University of Strathclyde
Supervisor: Hong Yue (Supervisor) & Anja Lowit (Supervisor)
