Polynomial eigenvalue decomposition-based target speaker voice activity detection in the presence of competing talkers

Vincent W. Neo, Stephan Weiss, Simon W. McKnight, Aidan O. T. Hogg, Patrick A. Naylor

Research output: Contribution to conference › Paper › peer-review

4 Citations (Scopus)
7 Downloads (Pure)

Abstract

Voice activity detection (VAD) algorithms are essential for many speech processing applications, such as speaker diarization, automatic speech recognition, speech enhancement, and speech coding. With a good VAD algorithm, non-speech segments can be excluded to improve the performance and reduce the computational load of these applications. In this paper, we propose a polynomial eigenvalue decomposition-based target-speaker VAD algorithm to detect unseen target speakers in the presence of competing talkers. The proposed approach uses frame-based processing across multiple microphones to compute the syndrome energy, which is used to test for the presence or absence of a target speaker. The proposed approach is consistently among the best in F1 and balanced accuracy scores over the investigated range of signal-to-interference ratio (SIR) from -10 dB to 20 dB.
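
For readers unfamiliar with the two evaluation metrics named in the abstract, the following is a minimal sketch of how F1 and balanced accuracy could be computed from per-frame VAD decisions. It is not taken from the paper: the function name `vad_scores` and the example label sequences are hypothetical, and the paper's own evaluation protocol (frame length, labelling, averaging) may differ.

```python
def vad_scores(reference, predicted):
    """Return (F1, balanced accuracy) for per-frame binary VAD labels.

    `reference` and `predicted` are equal-length sequences where a truthy
    value marks a frame in which the target speaker is active.
    Hypothetical helper for illustration only.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")

    # Confusion-matrix counts over frames.
    tp = sum(1 for r, p in zip(reference, predicted) if r and p)
    tn = sum(1 for r, p in zip(reference, predicted) if not r and not p)
    fp = sum(1 for r, p in zip(reference, predicted) if not r and p)
    fn = sum(1 for r, p in zip(reference, predicted) if r and not p)

    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # Balanced accuracy averages the true-positive and true-negative rates.
    balanced_accuracy = 0.5 * (recall + specificity)
    return f1, balanced_accuracy


# Made-up labels for eight frames (1 = target speaker active).
reference = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 1, 0, 0, 0, 1, 1, 0]
f1, balanced_accuracy = vad_scores(reference, predicted)
print(f"F1 = {f1:.2f}, balanced accuracy = {balanced_accuracy:.2f}")
```

Balanced accuracy is useful alongside F1 because it weights active and inactive frames equally, so a detector cannot score well simply by favouring the more frequent class.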

Original language: English
Pages: 1-5
Number of pages: 5
Publication status: Published - 17 Oct 2022
Event: 17th International Workshop on Acoustic Signal Enhancement - Bamberg, Germany
Duration: 5 Sep 2022 to 8 Sep 2022

Conference

Conference: 17th International Workshop on Acoustic Signal Enhancement
Abbreviated title: IWAENC 2022
Country/Territory: Germany
City: Bamberg
Period: 5/09/22 to 8/09/22

Keywords

  • polynomial eigenvalue decomposition
  • target speaker voice activity detection
  • speaker activity detection
