An Automatic Post Editing With Efficient and Simple Data Generation Method

Authors
Moon, Hyeonseok; Park, Chanjun; Seo, Jaehyung; Eo, Sugyeong; Lim, Heuiseok
Issue Date
2022
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Keywords
Automatic post editing; neural machine translation; data generation; machine translation; post editing
Citation
IEEE ACCESS, v.10, pp.21032 - 21040
Indexed
SCIE
SCOPUS
Journal Title
IEEE ACCESS
Volume
10
Start Page
21032
End Page
21040
URI
https://scholar.korea.ac.kr/handle/2021.sw.korea/139466
DOI
10.1109/ACCESS.2022.3152001
ISSN
2169-3536
Abstract
Automatic post-editing (APE) research considers methods for correcting the output of machine translation systems. Training APE models generally requires triplets consisting of a source sentence (src), a machine translation sentence (mt), and a post-edited sentence (pe). Because creating pe requires considerable expert-level human labor, APE research has had difficulty constructing suitable datasets. As a result, APE data is absent for most language pairs, such as Korean-English, which limits sustainable APE research. Motivated by this problem, we propose a method that generates APE triplets from a parallel corpus alone, without human labor. Our proposal comprises three noise generation techniques: random noise, part-of-speech (POS) tagging based noise, and semantic-level noise. The effectiveness of these methods is verified by quantitative and qualitative experiments on Korean-English APE tasks. In our experiments, POS-based noise yields the best APE performance. The proposed method is significant in that it obviates the expert human labor generally required in APE data construction and enables sustainable APE research for the many language pairs where human-edited APE triplets are unavailable.
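
To make the idea of building triplets from a parallel corpus concrete, the following is a minimal sketch of the random-noise variant only. It is not the authors' implementation: the abstract does not specify the noise operations, so the delete/substitute/insert choices, the `noise_prob` parameter, whitespace tokenization, and the function names `add_random_noise` and `make_triplets` are all assumptions made for illustration. The target side of each parallel pair is kept as pe, and a corrupted copy of it stands in for mt.

```python
import random


def add_random_noise(tokens, vocab, noise_prob=0.15, rng=None):
    """Corrupt a token list with random edits (assumed operations,
    not taken from the paper): delete, substitute, or insert."""
    rng = rng or random.Random()
    noisy = []
    for tok in tokens:
        if rng.random() < noise_prob:
            op = rng.choice(["delete", "substitute", "insert"])
            if op == "delete":
                continue  # drop the token
            if op == "substitute":
                noisy.append(rng.choice(vocab))  # replace with a random word
                continue
            # insert: keep the token and add a random word after it
            noisy.append(tok)
            noisy.append(rng.choice(vocab))
        else:
            noisy.append(tok)
    # Fall back to the clean tokens if every token was deleted.
    return noisy or list(tokens)


def make_triplets(parallel_corpus, noise_prob=0.15, seed=0):
    """Turn (src, tgt) pairs into (src, mt, pe) triplets, where pe is
    the clean target and mt is a noised copy of it."""
    rng = random.Random(seed)
    # Hypothetical vocabulary: every whitespace token on the target side.
    vocab = sorted({tok for _, tgt in parallel_corpus for tok in tgt.split()})
    triplets = []
    for src, tgt in parallel_corpus:
        mt = " ".join(add_random_noise(tgt.split(), vocab, noise_prob, rng))
        triplets.append((src, mt, tgt))
    return triplets


if __name__ == "__main__":
    corpus = [
        ("나는 학교에 간다", "I go to school"),
        ("그녀는 책을 읽는다", "She reads a book"),
    ]
    for src, mt, pe in make_triplets(corpus, noise_prob=0.3, seed=42):
        print(src, "|", mt, "|", pe)
```

The POS-based and semantic-level variants described in the abstract would need a tagger or a similarity model to choose replacement words, which this sketch deliberately leaves out.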
Appears in Collections
Graduate School > Department of Computer Science and Engineering > 1. Journal Articles