Polynomial eigenvalue decomposition-based target speaker voice activity detection in the presence of competing talkers

Vincent W. Neo, Stephan Weiss, Simon W. McKnight, Aidan O. T. Hogg, Patrick A. Naylor

Research output: Contribution to conference › Paper › peer-review

12 Citations (Scopus)
40 Downloads (Pure)

Abstract

Voice activity detection (VAD) algorithms are essential for many speech processing applications, such as speaker diarization, automatic speech recognition, speech enhancement, and speech coding. With a good VAD algorithm, non-speech segments can be excluded to improve the performance and reduce the computational load of these applications. In this paper, we propose a polynomial eigenvalue decomposition-based target-speaker VAD algorithm to detect unseen target speakers in the presence of competing talkers. The proposed approach uses frame-based processing across multiple microphones to compute the syndrome energy, which is used to test for the presence or absence of a target speaker. The proposed approach is consistently among the best in F1 and balanced accuracy scores over the investigated range of signal-to-interference ratio (SIR) from -10 dB to 20 dB.
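
For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how F1 and balanced accuracy can be computed from frame-level binary VAD decisions. It uses the standard textbook definitions and is not taken from the paper; the function name `vad_scores` and the example label sequences are hypothetical and chosen only for illustration.

```python
def vad_scores(reference, predicted):
    """Return (F1, balanced accuracy) for frame-level binary VAD labels.

    Both arguments are equal-length sequences of 0/1 values, where 1 marks
    a frame in which the target speaker is active. Names and data here are
    illustrative assumptions, not the paper's evaluation code.
    """
    if len(reference) != len(predicted):
        raise ValueError("label sequences must have equal length")

    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r and p)          # active, detected
    tn = sum(1 for r, p in pairs if not r and not p)  # inactive, rejected
    fp = sum(1 for r, p in pairs if not r and p)      # false alarm
    fn = sum(1 for r, p in pairs if r and not p)      # missed detection

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0       # true positive rate
    specificity = tn / (tn + fp) if tn + fp else 0.0  # true negative rate

    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    balanced_accuracy = 0.5 * (recall + specificity)
    return f1, balanced_accuracy


# Hypothetical frame labels: 1 = target speaker active, 0 = inactive.
reference = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 1, 0, 0, 1, 0, 1, 0]
f1, bal_acc = vad_scores(reference, predicted)
```

Balanced accuracy averages the detection rates on active and inactive frames, so it is not inflated when one class dominates the recording, which is one reason it is often reported alongside F1.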

Original language: English
Pages: 1-5
Number of pages: 5
Publication status: Published - 17 Oct 2022
Event: 17th International Workshop on Acoustic Signal Enhancement - Bamberg, Germany
Duration: 5 Sept 2022 to 8 Sept 2022

Conference

Conference: 17th International Workshop on Acoustic Signal Enhancement
Abbreviated title: IWAENC 2022
Country/Territory: Germany
City: Bamberg
Period: 5/09/22 to 8/09/22

Keywords

  • polynomial eigenvalue decomposition
  • target speaker voice activity detection
  • speaker activity detection
