Spectro-Temporal Attention-Based Voice Activity Detection
- Authors
- Lee, Younglo; Min, Jeongki; Han, David K.; Ko, Hanseok
- Issue Date
- 2020
- Publisher
- IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
- Keywords
- Deep neural networks; attention mechanism; voice activity detection; speech activity detection; speech detection
- Citation
- IEEE SIGNAL PROCESSING LETTERS, v.27, pp.131 - 135
- Indexed
- SCIE; SCOPUS
- Journal Title
- IEEE SIGNAL PROCESSING LETTERS
- Volume
- 27
- Start Page
- 131
- End Page
- 135
- URI
- https://scholar.korea.ac.kr/handle/2021.sw.korea/59124
- DOI
- 10.1109/LSP.2019.2959917
- ISSN
- 1070-9908
- Abstract
- Voice Activity Detection (VAD) systems suffer from unexpected and non-stationary background noises at magnitudes sufficiently high to mask the speech signal. Although several methods of increasing the performance of VAD have been proposed, their approaches have yet to mitigate the influence of the background noise itself. This letter proposes an effective noise-robust VAD approach. The proposed method applies a deep learning-based attention mechanism to provide spectral attention and temporal attention. The proposed method is compared with several other deep learning-based methods in terms of the area under the curve, in experiments on data with known or unknown added noise and on real-world noisy data. The results show that the proposed method outperforms the other methods in all the scenarios considered and, moreover, generalizes well in environments of unknown or unexpected noise.
- Appears in Collections
- College of Engineering > School of Electrical Engineering > 1. Journal Articles