Word-Level Quality Estimation for Korean-English Neural Machine Translation

Eo, Sugyeong; Park, Chanjun; Moon, Hyeonseok; Seo, Jaehyung; Lim, Heuiseok

doi:10.1109/ACCESS.2022.3169155

Full metadata record

DC Field	Value	Language
dc.contributor.author	Eo, Sugyeong	-
dc.contributor.author	Park, Chanjun	-
dc.contributor.author	Moon, Hyeonseok	-
dc.contributor.author	Seo, Jaehyung	-
dc.contributor.author	Lim, Heuiseok	-
dc.date.accessioned	2022-06-12T16:41:04Z	-
dc.date.available	2022-06-12T16:41:04Z	-
dc.date.created	2022-06-09	-
dc.date.issued	2022	-
dc.identifier.issn	2169-3536	-
dc.identifier.uri	https://scholar.korea.ac.kr/handle/2021.sw.korea/142159	-
dc.description.abstract	The quality estimation (QE) task aims to predict machine translation (MT) quality by referring to the source sentence and its MT output. The broad applicability of QE demonstrates the importance of QE research, but the enormous human labor required to construct QE datasets remains a challenge. This study proposes three strategies for automatically constructing word-level pseudo-QE data from a monolingual or parallel corpus and an external machine translator, without human labor. We use each of these pseudo-QE datasets to fine-tune multilingual pretrained language models (mPLMs) such as the cross-lingual language model (XLM), XLM-RoBERTa (XLM-R), and multilingual BART, and comparatively analyze the results. Because the datasets are created synthetically, we attempt to validate the objectivity of the QE model using four test sets translated by external translators from Google, Amazon, Microsoft, and Systran. XLM-R-large shows the best performance among the mPLMs. We also verify the reliability of the QE model through the small performance gaps between the different test sets. To the best of our knowledge, this is the first study to experiment with word-level Korean-English QE.	-
dc.language	English	-
dc.language.iso	en	-
dc.publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC	-
dc.title	Word-Level Quality Estimation for Korean-English Neural Machine Translation	-
dc.type	Article	-
dc.contributor.affiliatedAuthor	Lim, Heuiseok	-
dc.identifier.doi	10.1109/ACCESS.2022.3169155	-
dc.identifier.scopusid	2-s2.0-85129225412	-
dc.identifier.wosid	000790767300001	-
dc.identifier.bibliographicCitation	IEEE ACCESS, v.10, pp.44964 - 44973	-
dc.relation.isPartOf	IEEE ACCESS	-
dc.citation.title	IEEE ACCESS	-
dc.citation.volume	10	-
dc.citation.startPage	44964	-
dc.citation.endPage	44973	-
dc.type.rims	ART	-
dc.type.docType	Article	-
dc.description.journalClass	1	-
dc.description.isOpenAccess	Y	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
dc.relation.journalResearchArea	Computer Science	-
dc.relation.journalResearchArea	Engineering	-
dc.relation.journalResearchArea	Telecommunications	-
dc.relation.journalWebOfScienceCategory	Computer Science, Information Systems	-
dc.relation.journalWebOfScienceCategory	Engineering, Electrical & Electronic	-
dc.relation.journalWebOfScienceCategory	Telecommunications	-
dc.subject.keywordAuthor	Predictive models	-
dc.subject.keywordAuthor	Data models	-
dc.subject.keywordAuthor	Feature extraction	-
dc.subject.keywordAuthor	Task analysis	-
dc.subject.keywordAuthor	Annotations	-
dc.subject.keywordAuthor	Costs	-
dc.subject.keywordAuthor	Machine translation	-
dc.subject.keywordAuthor	Quality estimation	-
dc.subject.keywordAuthor	neural machine translation	-
dc.subject.keywordAuthor	multilingual pretrained language model	-
dc.subject.keywordAuthor	natural language processing	-

Files in This Item: There are no files associated with this item.

Appears in Collections: Graduate School > Department of Computer Science and Engineering > 1. Journal Articles
