Maximising audiovisual correlation with automatic lip tracking and vowel based segmentation

Andrew Abel*, Amir Hussain, Quoc Dinh Nguyen, Fabien Ringeval, Mohamed Chetouani, Maurice Milgram

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution book

9 Citations (Scopus)

Abstract

In recent years, the established link between the various human communication production domains has become more widely utilised in the field of speech processing. In this work, a state-of-the-art Semi Adaptive Appearance Model (SAAM) approach developed by the authors is used for automatic lip tracking, and an adapted version of our vowel-based speech segmentation system is employed to automatically segment speech. Canonical Correlation Analysis (CCA) on segmented and non-segmented data in a range of noisy speech environments finds that segmented speech has a significantly better audiovisual correlation, demonstrating the feasibility of our techniques for further development as part of a proposed audiovisual speech enhancement system.
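
The abstract relies on Canonical Correlation Analysis to measure how strongly audio and visual features co-vary. The following is a minimal sketch of that measurement only, not the authors' pipeline: it does not implement SAAM lip tracking or vowel-based segmentation, and the feature matrices are synthetic. The function name `canonical_correlations`, the variable names, the feature dimensions, and the noise level are all assumptions chosen for illustration. The sketch uses the standard QR-plus-SVD formulation of CCA and assumes both feature matrices have full column rank.

```python
import numpy as np


def canonical_correlations(X, Y):
    """Canonical correlations between row-aligned feature matrices.

    X has shape (n_frames, p) and Y has shape (n_frames, q).
    Assumes n_frames >= max(p, q) and full column rank after centring.
    Returns min(p, q) correlations in descending order.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if X.shape[0] != Y.shape[0]:
        raise ValueError("X and Y must have the same number of rows (frames)")

    # Centre each feature so that inner products behave like covariances.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Orthonormal bases for the column spaces of the centred features.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)

    # The singular values of Qx^T Qy are the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)


# Synthetic stand-ins (hypothetical): "audio" and "visual" features that
# share one latent signal, plus an unrelated feature set for comparison.
rng = np.random.default_rng(0)
n_frames = 500
latent = rng.standard_normal((n_frames, 1))

audio = latent @ np.array([[1.0, -0.5, 0.8, 0.3]])
audio += 0.1 * rng.standard_normal((n_frames, 4))

visual = latent @ np.array([[0.7, -1.2, 0.4]])
visual += 0.1 * rng.standard_normal((n_frames, 3))

unrelated = rng.standard_normal((n_frames, 3))

r_shared = canonical_correlations(audio, visual)
r_unrelated = canonical_correlations(audio, unrelated)

print("shared latent, first canonical correlation:", r_shared[0])
print("unrelated, first canonical correlation:", r_unrelated[0])
```

In this toy setting the first canonical correlation is much higher for the pair that shares a latent signal than for the unrelated pair. A comparison of segmented and non-segmented speech would apply the same measurement to the two sets of feature frames.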

Original language: English
Title of host publication: Biometric ID Management and Multimodal Communication
Subtitle of host publication: Joint COST 2101 and 2102 International Conference
Editors: Julian Fierrez, Javier Ortega-Garcia, Anna Esposito, Andrzej Drygajlo, Marcos Faundez-Zanuy
Place of Publication: Cham, Switzerland
Publisher: Springer
Pages: 65-72
Number of pages: 8
ISBN (Electronic): 9783642043918
ISBN (Print): 3642043909, 9783642043901
Publication status: Published - 7 Sept 2009
Event: Joint COST 2101 and 2102 International Conference on Biometric ID Management and Multimodal Communication, BioID_MultiComm 2009 - Madrid, Spain
Duration: 16 Sept 2009 to 18 Sept 2009

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5707 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: Joint COST 2101 and 2102 International Conference on Biometric ID Management and Multimodal Communication, BioID_MultiComm 2009
Country/Territory: Spain
City: Madrid
Period: 16/09/09 to 18/09/09

Keywords

  • canonical correlation
  • canonical correlation analysis
  • noisy environment
  • speech enhancement
  • visual speech
