Dilated convolution and gated linear unit based sound event detection and tagging algorithm using weak label
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Park, Chungho | - |
dc.contributor.author | Kim, Donghyun | - |
dc.contributor.author | Ko, Hanseok | - |
dc.date.accessioned | 2021-08-31T16:10:02Z | - |
dc.date.available | 2021-08-31T16:10:02Z | - |
dc.date.created | 2021-06-18 | - |
dc.date.issued | 2020 | - |
dc.identifier.issn | 1225-4428 | - |
dc.identifier.uri | https://scholar.korea.ac.kr/handle/2021.sw.korea/59028 | - |
dc.description.abstract | In this paper, we propose a Dilated Convolution Gated Linear Unit (DCGLU) to mitigate the lack of sparsity and the small receptive field caused by the segmentation map extraction process in sound event detection with weak labels. With the advent of deep learning frameworks, segmentation map extraction approaches have shown improved performance in noisy environments. However, because the feature map must keep its size for the segmentation map to be extracted, these models are built without pooling operations. As a result, their performance deteriorates due to a lack of sparsity and a small receptive field. To mitigate these problems, we use Gated Linear Units (GLUs) to control the flow of information and Dilated Convolutional Neural Networks (DCNNs) to enlarge the receptive field without additional learnable parameters. For performance evaluation, we use the URBAN-SED dataset and a self-organized bird sound dataset. The experiments show that the proposed DCGLU model outperforms the other baselines. In particular, our method is robust to natural sound noise at three Signal-to-Noise Ratio (SNR) levels (20 dB, 10 dB, and 0 dB). | - |
dc.language | Korean | - |
dc.language.iso | ko | - |
dc.publisher | ACOUSTICAL SOC KOREA | - |
dc.subject | CLASSIFICATION | - |
dc.title | Dilated convolution and gated linear unit based sound event detection and tagging algorithm using weak label | - |
dc.type | Article | - |
dc.contributor.affiliatedAuthor | Ko, Hanseok | - |
dc.identifier.doi | 10.7776/ASK.2020.39.5.414 | - |
dc.identifier.wosid | 000594710300005 | - |
dc.identifier.bibliographicCitation | JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA, v.39, no.5, pp.414 - 423 | - |
dc.relation.isPartOf | JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA | - |
dc.citation.title | JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA | - |
dc.citation.volume | 39 | - |
dc.citation.number | 5 | - |
dc.citation.startPage | 414 | - |
dc.citation.endPage | 423 | - |
dc.type.rims | ART | - |
dc.type.docType | Article | - |
dc.identifier.kciid | ART002628514 | - |
dc.description.journalClass | 1 | - |
dc.description.journalRegisteredClass | scopus | - |
dc.description.journalRegisteredClass | kci | - |
dc.relation.journalResearchArea | Acoustics | - |
dc.relation.journalWebOfScienceCategory | Acoustics | - |
dc.subject.keywordPlus | CLASSIFICATION | - |
dc.subject.keywordAuthor | Audio tagging | - |
dc.subject.keywordAuthor | Sound event detection | - |
dc.subject.keywordAuthor | Dilated convolution | - |
dc.subject.keywordAuthor | Gated linear unit | - |
dc.subject.keywordAuthor | T-F segmentation map | - |
dc.subject.keywordAuthor | Weak label | - |
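
The abstract names two mechanisms: dilated convolution, which widens the receptive field without adding weights, and a gated linear unit, which scales a linear response by a sigmoid gate. The following is a minimal 1-D sketch of both ideas in plain Python. It is not the authors' DCGLU architecture, which the record does not specify and which presumably operates on 2-D time-frequency feature maps. All function names (`receptive_field`, `dilated_conv1d`, `dilated_glu`) are hypothetical, and the sketch assumes stride 1, odd kernel sizes, and zero padding.

```python
import math


def receptive_field(kernel_size, dilations):
    """Receptive field (in samples) of stacked stride-1 convolutions.

    Each layer with dilation d adds (kernel_size - 1) * d samples,
    while its weight count stays at kernel_size.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


def dilated_conv1d(x, w, b, dilation):
    """Zero-padded, same-length 1-D dilated convolution (odd kernel assumed)."""
    k = len(w)
    half = (k // 2) * dilation  # offset that centres the kernel on sample t
    out = []
    for t in range(len(x)):
        acc = b
        for i in range(k):
            j = t + i * dilation - half
            if 0 <= j < len(x):  # samples outside the signal count as zero
                acc += w[i] * x[j]
        out.append(acc)
    return out


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dilated_glu(x, w_lin, b_lin, w_gate, b_gate, dilation):
    """Gated linear unit: linear branch multiplied by a sigmoid gate branch."""
    lin = dilated_conv1d(x, w_lin, b_lin, dilation)
    gate = dilated_conv1d(x, w_gate, b_gate, dilation)
    return [a * sigmoid(g) for a, g in zip(lin, gate)]


if __name__ == "__main__":
    # Three 3-tap layers: same number of weights, different coverage.
    print(receptive_field(3, [1, 1, 1]))  # undilated stack
    print(receptive_field(3, [1, 2, 4]))  # dilated stack
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(dilated_glu(x, [0.0, 1.0, 0.0], 0.0, [0.0, 0.0, 0.0], 0.0, 2))
```

In this sketch, a gate pre-activation near zero passes half of the linear response, while a strongly negative one suppresses it almost entirely; that is the sense in which a gate can "control the flow of information". Because no pooling is involved, the output has the same length as the input, which matches the constraint that the feature map size be preserved.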