Convolutional neural networks for pathological voice detection

Research output: Contribution to conference (Paper)

1 Citation (Scopus)

Abstract

Acoustic analysis using signal processing tools can be used to extract voice features that distinguish pathological voices from healthy ones. The proposed work uses spectrograms of voice recordings from a voice database as the input to a Convolutional Neural Network (CNN) for automatic feature extraction and classification of disordered and normal voices. The novel classifier achieved 88.5%, 66.2% and 77.0% accuracy on the training, validation and testing data sets, respectively, using 482 normal and 482 organic dysphonia speech files. These results indicate that the proposed algorithm, evaluated on the Saarbruecken Voice Database, can be used effectively for screening pathological voice recordings.
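The abstract does not state how the spectrograms were computed, so the following is only a minimal sketch of the general idea of turning a recording into a time-frequency image that a CNN could take as input. The function name `log_spectrogram`, the frame length, the hop size, the Hann window and the synthetic 200 Hz tone are all assumptions made for illustration; none of them come from the paper, and the CNN itself is not shown.

```python
import numpy as np


def log_spectrogram(signal, frame_len=256, hop=128, eps=1e-10):
    """Return a log-magnitude spectrogram of shape (n_frames, frame_len // 2 + 1).

    All parameter defaults are illustrative assumptions, not the paper's settings.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (signal.size - frame_len) // hop
    window = np.hanning(frame_len)
    # Slice the signal into overlapping frames and taper each one.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # One-sided FFT magnitude per frame, converted to decibels.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return 20.0 * np.log10(magnitude + eps)


# Synthetic stand-in for a voice recording: one second of a 200 Hz tone at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 200.0 * t)

spec = log_spectrogram(tone)
# Frequency bin with the highest average energy, converted back to hertz.
peak_bin = int(np.argmax(spec.mean(axis=0)))
peak_hz = peak_bin * sample_rate / 256
```

The resulting 2-D array has one row per frame and one column per frequency bin. An image-style classifier would typically receive such an array, possibly resized or normalised, as a single-channel input.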

Conference

Conference: 40th International Conference of the IEEE Engineering in Medicine and Biology Society
Abbreviated title: EMBC 2018
Country: United States
City: Honolulu, Hawaii
Period: 17/07/18 to 21/07/18
Internet address: https://embc.embs.org/2018/

Fingerprint

Neural networks
Feature extraction
Signal processing
Screening
Classifiers
Acoustics
Testing

Keywords

  • acoustic analysis
  • signal processing
  • pathological voice detection
  • healthy voice detection

Cite this

Wu, H., Soraghan, J., Lowit, A., & Di Caterina, G. (2018). Convolutional neural networks for pathological voice detection. Paper presented at 40th International Conference of the IEEE Engineering in Medicine and Biology Society, Honolulu, Hawaii, United States.
Wu, Huiyi; Soraghan, John; Lowit, Anja; Di Caterina, Gaetano. / Convolutional neural networks for pathological voice detection. Paper presented at 40th International Conference of the IEEE Engineering in Medicine and Biology Society, Honolulu, Hawaii, United States. 4 p.
@conference{d7f8d841ecd942d9b2c3ce279383ab4a,
title = "Convolutional neural networks for pathological voice detection",
abstract = "Acoustic analysis using signal processing tools can be used to extract voice features that distinguish pathological voices from healthy ones. The proposed work uses spectrograms of voice recordings from a voice database as the input to a Convolutional Neural Network (CNN) for automatic feature extraction and classification of disordered and normal voices. The novel classifier achieved 88.5{\%}, 66.2{\%} and 77.0{\%} accuracy on the training, validation and testing data sets, respectively, using 482 normal and 482 organic dysphonia speech files. These results indicate that the proposed algorithm, evaluated on the Saarbruecken Voice Database, can be used effectively for screening pathological voice recordings.",
keywords = "acoustic analysis, signal processing, pathological voice detection, healthy voice detection",
author = "Huiyi Wu and John Soraghan and Anja Lowit and {Di Caterina}, Gaetano",
year = "2018",
month = "7",
day = "17",
language = "English",
note = "40th International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2018 ; Conference date: 17-07-2018 Through 21-07-2018",
url = "https://embc.embs.org/2018/",

}
