TY - GEN
T1 - Cognitively inspired speech processing for multimodal hearing technology
AU - Abel, Andrew
AU - Hussain, Amir
AU - Luo, Bin
N1 - Publisher Copyright: © 2014 IEEE.
N1 - A. Abel, A. Hussain and B. Luo, "Cognitively inspired speech processing for multimodal hearing technology," 2014 IEEE Symposium on Computational Intelligence in Healthcare and e-health (CICARE), Orlando, FL, USA, 2014, pp. 56-63, doi: 10.1109/CICARE.2014.7007834.
PY - 2015/1/12
Y1 - 2015/1/12
N2 - In recent years, the link between the various human communication production domains has become more widely utilised in the field of speech processing. Work by the authors and others has demonstrated that intelligently integrated audio and visual information can be used for speech enhancement. This advance means that the use of visual information as part of hearing aids or assistive listening devices is becoming ever more viable. One issue that is rarely explored is how a multimodal system copes with variations in data quality and availability, such as a speaker covering their face while talking, or the presence of multiple speakers in a conversational scenario; a hearing device would be expected to cope with such changes by switching between different programmes and settings to adapt to the environment. We present the Challeng AV audiovisual corpus, which is used to evaluate a novel fuzzy logic-based audiovisual switching system designed to form part of a next-generation adaptive, autonomous, context-aware hearing system. Initial results show that the detectors are capable of determining environmental conditions and responding appropriately, demonstrating the potential of such an adaptive multimodal system as part of a state-of-the-art hearing aid device.
KW - visualization
KW - speech
KW - input variables
KW - detectors
KW - fuzzy logic
KW - noise
KW - speech enhancement
KW - audio-visual systems
KW - multimodal system
UR - http://www.scopus.com/inward/record.url?scp=84922553453&partnerID=8YFLogxK
U2 - 10.1109/CICARE.2014.7007834
DO - 10.1109/CICARE.2014.7007834
M3 - Conference contribution book
AN - SCOPUS:84922553453
T3 - IEEE SSCI 2014 - 2014 IEEE Symposium Series on Computational Intelligence - CICARE 2014: 2014 IEEE Symposium on Computational Intelligence in Healthcare and e-Health, Proceedings
SP - 56
EP - 63
BT - IEEE SSCI 2014 - 2014 IEEE Symposium Series on Computational Intelligence - CICARE 2014
T2 - 2014 2nd IEEE Symposium on Computational Intelligence in Healthcare and e-Health, CICARE 2014
Y2 - 9 December 2014 through 12 December 2014
ER -