Vocal cord movement abnormalities are diagnosed by subjective visual assessment using endoscopy. Objective measures based on image processing have been proposed in previous studies to overcome the subjectivity of current clinical practice. However, these have mainly focussed on quantifying high-speed vocal cord vibrations using specialist, expensive acquisition systems. An approach more applicable to routine clinics would be to quantify the slower vocal cord movements, i.e., abduction and adduction, which are recordable at normal camera capture rates. Moreover, in the UK, the flexible fibreoptic endoscope is preferred for primary diagnosis, but it renders poorer image quality than the rigid laryngoscope commonly used for objective assessment. Therefore, in this thesis, a generalisable technique is developed that quantifies vocal cord abduction and adduction through novel image processing techniques for videos acquired at the routine voice clinic.

In the absence of publicly available data on vocal cord motion acquired at the voice clinic using flexible fibreoptic endoscopy and normal camera capture rates, such a database is created in this work with 30 videos of normal and abnormal cases. A 5-category scale is designed for quantifying vocal cord motion because clinicians do not currently have a numerical grading system. Vocal cord motion in the video database is graded on the proposed rating scale by six clinicians through subjective visual assessment. Inter- and intra-rater agreement and reliability measures are computed to evaluate their performance using the scale. Furthermore, ground truth scores of vocal cord motion are obtained from the clinicians' ratings for all the videos in the database.

A novel framework is presented for the localisation and segmentation of the glottal area in a given image sequence of vocal cord abduction or adduction from the database.
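As an aside on the rater-agreement analysis above: the abstract does not name the statistics used, but a minimal sketch of one standard inter-rater agreement measure for categorical ratings, Fleiss' kappa, illustrates the kind of computation involved. The six-rater, 5-category setup mirrors the study; the toy counts below are invented.

```python
import numpy as np

def fleiss_kappa(counts):
    """Fleiss' kappa for a (subjects x categories) matrix of rating counts.

    counts[i, j] = number of raters who assigned category j to subject i;
    every row must sum to the same number of raters n.
    """
    counts = np.asarray(counts, dtype=float)
    n = counts.sum(axis=1)[0]                    # raters per subject
    # observed agreement per subject: pairs of raters who agree
    p_i = (counts * (counts - 1)).sum(axis=1) / (n * (n - 1))
    p_bar = p_i.mean()                           # mean observed agreement
    p_j = counts.sum(axis=0) / counts.sum()      # category proportions
    p_e = (p_j ** 2).sum()                       # agreement expected by chance
    return (p_bar - p_e) / (1 - p_e)

# Invented toy data: 4 videos rated by 6 clinicians on a 5-category scale
toy = np.array([
    [6, 0, 0, 0, 0],   # unanimous
    [0, 6, 0, 0, 0],   # unanimous
    [0, 0, 3, 3, 0],   # split between two adjacent grades
    [0, 0, 0, 2, 4],
])
print(fleiss_kappa(toy))   # ≈ 0.64
```

Intra-rater reliability would be assessed analogously by comparing repeated ratings from the same clinician.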
The challenges specific to abducting and adducting vocal cords in fibreoptic endoscopy videos are addressed, since algorithms developed in previous studies for vibrating vocal cords imaged with rigid endoscopy cannot be directly applied to the present database. In particular, the honeycomb artefact is suppressed, and a knowledge-based approach is proposed for glottis localisation and removal of spatial glottal drift, utilising a single user-defined reference point in one frame of a sequence. Techniques are proposed for image enhancement, initial contour estimation using SUSAN edge detection and thresholding, and glottal area segmentation with a localised region-based level set method. Together, these techniques form a novel framework that accounts for the variation in shape, size and illumination of the glottal area in a sequence of abducting/adducting vocal cords.

A novel model called SynGlotIm is developed to create synthetic image sequences of the glottal area during abduction and adduction. Analogous to the head phantoms used in MRI, this model is the first of its kind to synthesise glottal images over a realistic range of abduction angles, intensity inhomogeneity patterns of the glottal area, image contrast, blurring and noise, through modification of its input parameters. Four synthetic sequences that simulate real ones from the database are segmented. The similarity between the segmented contours from the synthetic and real images demonstrates that SynGlotIm can be used to validate segmentation algorithms. Thus, this technique serves as an alternative to the laborious and time-consuming process of generating manually marked ground truth contours by clinicians.

The quantification of vocal cord abduction/adduction has so far only been achieved by measuring the angle between the straight edges of the vocal cords, which is prone to inaccuracies such as those caused by the tilt of the endoscope.
Therefore, a novel approach is proposed wherein optical flow is used for motion estimation of the vocal cord edges and two optical flow features are extracted to generate a symmetry score from 0 to 1, where higher values indicate better symmetry. Of the two features, the Histogram of Oriented Optical Flow (HOOF) provides a better estimate of the degree of symmetry in paralysed cases. Furthermore, the Maximum Abduction Angle (MAA) of the glottis is calculated automatically. Finally, an improved estimate of movement symmetry during vocal cord abduction/adduction is obtained by training a Radial Basis Function (RBF) neural network with the HOOF symmetry scores and the MAA values to generate a quantitative score. Moreover, the granularity of the proposed technique allows categorisation of cases as normal, paresis and paralysis, which has not been achieved in other studies. The proposed technique is potentially useful for evaluating post-treatment outcomes and in challenging cases such as recognition of paresis.
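To make the symmetry-score idea concrete, the following is a minimal illustrative sketch, not the thesis implementation: it computes HOOF histograms for the left and right halves of a flow field, mirrors the left half, and compares the two by histogram intersection, which is one plausible way (the exact comparison is not specified in this abstract) to obtain a score from 0 to 1 where higher means more symmetric. All function names and parameters here are hypothetical.

```python
import numpy as np

def hoof(u, v, bins=8):
    """Histogram of Oriented Optical Flow (Chaudhry et al., 2009):
    flow vectors binned by angle, weighted by magnitude, normalised."""
    ang = np.arctan2(v, u)                      # flow direction in (-pi, pi]
    mag = np.hypot(u, v)                        # flow magnitude
    hist, _ = np.histogram(ang, bins=bins, range=(-np.pi, np.pi), weights=mag)
    total = hist.sum()
    return hist / total if total > 0 else hist

def symmetry_score(u, v, bins=8):
    """Score in [0, 1] comparing left- and right-half flow fields.

    The left half is mirrored (x-component negated) so that perfectly
    symmetric abduction/adduction yields identical histograms; histogram
    intersection then gives 1 for identical motion, 0 for disjoint motion."""
    mid = u.shape[1] // 2
    h_left = hoof(-u[:, :mid], v[:, :mid], bins)    # mirrored left half
    h_right = hoof(u[:, mid:], v[:, mid:], bins)
    return float(np.minimum(h_left, h_right).sum())

# Mirror-symmetric abduction: left cord moves left, right cord moves right
u = np.ones((10, 10)); u[:, :5] = -1.0
v = np.zeros((10, 10))
print(symmetry_score(u, v))   # → 1.0 (perfect symmetry)
```

In practice, the flow field would come from an optical flow estimator applied to consecutive endoscopy frames and restricted to the vocal cord edges, and the resulting HOOF symmetry scores would feed the RBF network together with the MAA values.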
| Date of Award | 28 May 2020 |
| --- | --- |
| Original language | English |
| Awarding Institution | University of Strathclyde |
| Sponsors | University of Strathclyde & Beatson Institute for Cancer Research |
| Supervisor | Lykourgos Petropoulakis (Supervisor) & John Soraghan (Supervisor) |