Orthogonal Gradient Penalty for Fast Training of Wasserstein GAN Based Multi-Task Autoencoder toward Robust Speech Recognition

Full metadata record
DC Field | Value | Language
dc.contributor.author | Kao, Chao-Yuan | -
dc.contributor.author | Park, Sangwook | -
dc.contributor.author | Badi, Alzahra | -
dc.contributor.author | Han, David K. | -
dc.contributor.author | Ko, Hanseok | -
dc.date.accessioned | 2021-08-31T01:11:23Z | -
dc.date.available | 2021-08-31T01:11:23Z | -
dc.date.created | 2021-06-19 | -
dc.date.issued | 2020-05 | -
dc.identifier.issn | 1745-1361 | -
dc.identifier.uri | https://scholar.korea.ac.kr/handle/2021.sw.korea/56123 | -
dc.description.abstract | Performance in Automatic Speech Recognition (ASR) degrades dramatically in noisy environments. To alleviate this problem, a variety of deep networks based on convolutional neural networks and recurrent neural networks were proposed by applying L1 or L2 loss. In this Letter, we propose a new orthogonal gradient penalty (OGP) method for Wasserstein Generative Adversarial Networks (WGAN) applied to denoising and despeeching models. WGAN integrates a multi-task autoencoder which estimates not only speech features but also noise features from noisy speech. While achieving 14.1% improvement in Wasserstein distance convergence rate, the proposed OGP enhanced features are tested in ASR and achieve 9.7%, 8.6%, 6.2%, and 4.8% WER improvements over DDAE, MTAE, R-CED(CNN) and RNN models. | -
dc.language | English | -
dc.language.iso | en | -
dc.publisher | IEICE-INST ELECTRONICS INFORMATION COMMUNICATIONS ENG | -
dc.title | Orthogonal Gradient Penalty for Fast Training of Wasserstein GAN Based Multi-Task Autoencoder toward Robust Speech Recognition | -
dc.type | Article | -
dc.contributor.affiliatedAuthor | Ko, Hanseok | -
dc.identifier.doi | 10.1587/transinf.2019EDL8183 | -
dc.identifier.scopusid | 2-s2.0-85084854925 | -
dc.identifier.wosid | 000530668200034 | -
dc.identifier.bibliographicCitation | IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, v.E103D, no.5, pp.1195 - 1198 | -
dc.relation.isPartOf | IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS | -
dc.citation.title | IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS | -
dc.citation.volume | E103D | -
dc.citation.number | 5 | -
dc.citation.startPage | 1195 | -
dc.citation.endPage | 1198 | -
dc.type.rims | ART | -
dc.type.docType | Article | -
dc.description.journalClass | 1 | -
dc.description.journalRegisteredClass | scie | -
dc.description.journalRegisteredClass | scopus | -
dc.relation.journalResearchArea | Computer Science | -
dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems | -
dc.relation.journalWebOfScienceCategory | Computer Science, Software Engineering | -
dc.subject.keywordAuthor | speech enhancement | -
dc.subject.keywordAuthor | generative adversarial networks | -
dc.subject.keywordAuthor | deep learning | -
dc.subject.keywordAuthor | robust speech recognition | -

Files in This Item: There are no files associated with this item.
Appears in Collections: College of Engineering > School of Electrical Engineering > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Ko, Han seok
College of Engineering (School of Electrical Engineering)