A polynomial subspace projection approach for the detection of weak voice activity

Vincent W. Neo, Stephan Weiss, Patrick A. Naylor

Research output: Contribution to conference › Paper › peer-review

Abstract

A voice activity detection (VAD) algorithm identifies whether or not time frames contain speech. It is essential for many military and commercial speech processing applications, including speech enhancement, speech coding, speaker identification, and automatic speech recognition. In this work, we adopt earlier work on detecting weak transient signals and propose a polynomial subspace projection pre-processor to improve an existing VAD algorithm. The proposed multi-channel pre-processor projects the microphone signals onto a lower-dimensional subspace, which attempts to remove the interferer components and thus eases the detection of the speech target. Compared to applying the same VAD to the microphone signal, the proposed approach almost always improves the F1 and balanced accuracy scores, even in adverse environments, e.g. at -30 dB signal-to-interference ratio (SIR), which may be typical of operations involving noisy machinery and signal jamming scenarios.
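
The abstract reports results as F1 and balanced accuracy scores. As an illustration only, the sketch below computes both from frame-level binary labels using their standard definitions. The function name `vad_scores`, the label convention (1 = speech, 0 = no speech), and the example frames are assumptions made here, not taken from the paper.

```python
def vad_scores(reference, decisions):
    """Return (F1, balanced accuracy) for frame-level VAD labels.

    Both inputs are equal-length sequences of 0/1 labels, where 1 marks a
    speech frame. This labelling convention is an assumption of this sketch.
    """
    if len(reference) != len(decisions):
        raise ValueError("reference and decisions must have the same length")

    pairs = list(zip(reference, decisions))
    tp = sum(1 for r, d in pairs if r == 1 and d == 1)  # speech detected
    tn = sum(1 for r, d in pairs if r == 0 and d == 0)  # non-speech rejected
    fp = sum(1 for r, d in pairs if r == 0 and d == 1)  # false alarm
    fn = sum(1 for r, d in pairs if r == 1 and d == 0)  # missed speech

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # true negative rate

    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    balanced_accuracy = 0.5 * (recall + specificity)
    return f1, balanced_accuracy


# Hypothetical example: 8 frames, 3 of which contain speech.
reference = [1, 1, 1, 0, 0, 0, 0, 0]
decisions = [1, 1, 0, 1, 0, 0, 0, 0]
f1, bal_acc = vad_scores(reference, decisions)
```

Balanced accuracy averages the detection rates of the speech and non-speech classes, so it stays informative when speech frames are rare, which is a plausible reason to report it alongside F1 for weak voice activity.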
Original language: English
Pages: 1-5
Number of pages: 5
Publication status: Published - 14 Sept 2022
Event: 11th International Conference in Sensor Signal Processing for Defence: from Sensor to Decision - London, United Kingdom
Duration: 13 Sept 2022 to 14 Sept 2022
Conference number: 11th
https://sspd.eng.ed.ac.uk/

Conference

Conference: 11th International Conference in Sensor Signal Processing for Defence
Abbreviated title: SSPD 2022
Country/Territory: United Kingdom
City: London
Period: 13/09/22 to 14/09/22

Keywords

  • voice activity detection
  • polynomial matrix eigenvalue decomposition
  • multi-channel signal processing
