Detailed Information


The ASR Post-Processor Performance Challenges of BackTranScription (BTS): Data-Centric and Model-Centric Approaches

Full metadata record
DC Field | Value | Language
dc.contributor.author | Park, Chanjun | -
dc.contributor.author | Seo, Jaehyung | -
dc.contributor.author | Lee, Seolhwa | -
dc.contributor.author | Lee, Chanhee | -
dc.contributor.author | Lim, Heuiseok | -
dc.date.accessioned | 2022-11-15T22:40:19Z | -
dc.date.available | 2022-11-15T22:40:19Z | -
dc.date.created | 2022-11-15 | -
dc.date.issued | 2022-10 | -
dc.identifier.issn | 2227-7390 | -
dc.identifier.uri | https://scholar.korea.ac.kr/handle/2021.sw.korea/145525 | -
dc.description.abstract | Training an automatic speech recognition (ASR) post-processor based on sequence-to-sequence (S2S) requires a parallel pair (e.g., speech recognition result and human post-edited sentence) to construct the dataset, which demands a great amount of human labor. BackTranScription (BTS) proposes a data-building method to mitigate the limitations of the existing S2S-based ASR post-processors, which can automatically generate vast amounts of training data, reducing time and cost in data construction. Despite the emergence of this novel approach, the BTS-based ASR post-processor still has research challenges and is mostly untested in diverse approaches. In this study, we highlight these challenges through detailed experiments by analyzing the data-centric approach (i.e., controlling the amount of data without model alteration) and the model-centric approach (i.e., model modification). In other words, we attempt to point out problems with the current trend of research pursuing a model-centric approach and caution against ignoring the importance of the data. Our experimental results show that the data-centric approach outperformed the model-centric approach by +11.69, +17.64, and +19.02 in F1-score, BLEU, and GLEU, respectively. | -
dc.language | English | -
dc.language.iso | en | -
dc.publisher | MDPI | -
dc.title | The ASR Post-Processor Performance Challenges of BackTranScription (BTS): Data-Centric and Model-Centric Approaches | -
dc.type | Article | -
dc.contributor.affiliatedAuthor | Lim, Heuiseok | -
dc.identifier.doi | 10.3390/math10193618 | -
dc.identifier.scopusid | 2-s2.0-85139941211 | -
dc.identifier.wosid | 000867915700001 | -
dc.identifier.bibliographicCitation | MATHEMATICS, v.10, no.19 | -
dc.relation.isPartOf | MATHEMATICS | -
dc.citation.title | MATHEMATICS | -
dc.citation.volume | 10 | -
dc.citation.number | 19 | -
dc.type.rims | ART | -
dc.type.docType | Article | -
dc.description.journalClass | 1 | -
dc.description.isOpenAccess | Y | -
dc.description.journalRegisteredClass | scie | -
dc.description.journalRegisteredClass | scopus | -
dc.relation.journalResearchArea | Mathematics | -
dc.relation.journalWebOfScienceCategory | Mathematics | -
dc.subject.keywordAuthor | backtranscription | -
dc.subject.keywordAuthor | machine translation | -
dc.subject.keywordAuthor | data-centric | -
dc.subject.keywordAuthor | model-centric | -
dc.subject.keywordAuthor | automatic speech recognition | -
dc.subject.keywordAuthor | post-processor | -
Appears in Collections
Graduate School > Department of Computer Science and Engineering > 1. Journal Articles
