Bird sounds classification by combining PNCC and robust Mel-log filter bank features

Badi, Alzahra; Ko, Kyungdeuk; Ko, Hanseok

doi:10.7776/ASK.2019.38.1.039

Full metadata record

DC Field	Value	Language
dc.contributor.author	Badi, Alzahra	-
dc.contributor.author	Ko, Kyungdeuk	-
dc.contributor.author	Ko, Hanseok	-
dc.date.accessioned	2021-09-01T21:53:02Z	-
dc.date.available	2021-09-01T21:53:02Z	-
dc.date.created	2021-06-19	-
dc.date.issued	2019-01	-
dc.identifier.issn	1225-4428	-
dc.identifier.uri	https://scholar.korea.ac.kr/handle/2021.sw.korea/68433	-
dc.description.abstract	This paper proposes combining features to improve sound classification accuracy in noisy environments using a CNN (Convolutional Neural Network). A robust log Mel-filter bank feature, obtained with a Wiener filter, and PNCCs (Power Normalized Cepstral Coefficients) are extracted and combined into a 2-dimensional feature that serves as input to the CNN. An eBird database is used to classify 43 bird species in their natural environment. To evaluate the combined feature in noisy environments, the database is augmented with 3 types of noise at 4 different SNRs (Signal to Noise Ratios): 20 dB, 10 dB, 5 dB, and 0 dB. The combined feature is compared against the log Mel-filter bank, with and without the Wiener filter, and against the PNCCs. In clean environments, the combined feature outperforms the other features, with a 1.34 % increase in overall average accuracy. In noisy environments, accuracy across the 4 SNR levels increases by 1.06 % and 0.65 % for shop and schoolyard noise backgrounds, respectively.	-
dc.language	English	-
dc.language.iso	en	-
dc.publisher	ACOUSTICAL SOC KOREA	-
dc.title	Bird sounds classification by combining PNCC and robust Mel-log filter bank features	-
dc.type	Article	-
dc.contributor.affiliatedAuthor	Ko, Hanseok	-
dc.identifier.doi	10.7776/ASK.2019.38.1.039	-
dc.identifier.scopusid	2-s2.0-85079185086	-
dc.identifier.wosid	000457557300005	-
dc.identifier.bibliographicCitation	JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA, v.38, no.1, pp.39 - 46	-
dc.relation.isPartOf	JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA	-
dc.citation.title	JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA	-
dc.citation.volume	38	-
dc.citation.number	1	-
dc.citation.startPage	39	-
dc.citation.endPage	46	-
dc.type.rims	ART	-
dc.type.docType	Article	-
dc.identifier.kciid	ART002434508	-
dc.description.journalClass	1	-
dc.description.journalRegisteredClass	scopus	-
dc.description.journalRegisteredClass	kci	-
dc.relation.journalResearchArea	Acoustics	-
dc.relation.journalWebOfScienceCategory	Acoustics	-
dc.subject.keywordAuthor	Acoustic event recognition	-
dc.subject.keywordAuthor	Environmental sound classification	-
dc.subject.keywordAuthor	CNN (Convolutional Neural Network)	-
dc.subject.keywordAuthor	Wiener filter	-
dc.subject.keywordAuthor	PNCCs (Power Normalized Cepstral Coefficients)	-
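
The abstract describes augmenting the database with background noise at four SNRs (20 dB, 10 dB, 5 dB, 0 dB). The record does not say how the mixing was done, so the following is only a minimal sketch of one common approach: scale the noise so that the ratio of mean signal power to mean noise power matches the target SNR, then add it to the clean signal. The function names (`mix_at_snr`, `measured_snr_db`), the 440 Hz test tone, and the Gaussian background are illustrative assumptions, not taken from the paper.

```python
import math
import random

# The four SNR levels listed in the abstract.
SNR_LEVELS_DB = (20, 10, 5, 0)


def mean_power(samples):
    """Average of the squared sample values."""
    return sum(x * x for x in samples) / len(samples)


def mix_at_snr(signal, noise, snr_db):
    """Add `noise` to `signal`, scaled to reach the target SNR in dB.

    Hypothetical helper: the noise is repeated or trimmed to the signal
    length, then scaled so that P_signal / P_noise = 10 ** (snr_db / 10).
    """
    if len(signal) == 0 or len(noise) == 0:
        raise ValueError("signal and noise must be non-empty")
    # Repeat the noise enough times to cover the signal, then trim.
    reps = -(-len(signal) // len(noise))  # ceiling division
    noise = (list(noise) * reps)[: len(signal)]
    p_signal = mean_power(signal)
    p_noise = mean_power(noise)
    if p_noise == 0.0:
        raise ValueError("noise is silent; cannot scale it to an SNR")
    gain = math.sqrt(p_signal / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(signal, noise)]


def measured_snr_db(signal, noisy):
    """SNR of `noisy` relative to the known clean `signal`, in dB."""
    residual = [y - s for s, y in zip(signal, noisy)]
    return 10 * math.log10(mean_power(signal) / mean_power(residual))


# Toy data standing in for a recording and a background noise clip.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
background = [rng.gauss(0.0, 1.0) for _ in range(4000)]

# One augmented copy of the clean signal per SNR level.
augmented = {snr: mix_at_snr(clean, background, snr) for snr in SNR_LEVELS_DB}
```

Measuring the SNR of each augmented copy with `measured_snr_db(clean, augmented[snr])` is a simple check that the scaling behaves as intended. Real recordings would normally be handled as arrays with an audio library rather than Python lists.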

Appears in Collections: College of Engineering > School of Electrical Engineering > 1. Journal Articles

Related Researcher

Ko, Han seok: College of Engineering (School of Electrical Engineering)

