Ancient Korean Neural Machine Translation

Authors
Park, Chanjun; Lee, Chanhee; Yang, Yeongwook; Lim, Heuiseok
Issue Date
2020
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Keywords
Ancient Korean translation; neural machine translation; transformer; subword tokenization; share vocabulary and entity restriction byte pair encoding
Citation
IEEE ACCESS, v.8, pp.116617 - 116625
Indexed
SCIE
SCOPUS
Journal Title
IEEE ACCESS
Volume
8
Start Page
116617
End Page
116625
URI
https://scholar.korea.ac.kr/handle/2021.sw.korea/58954
DOI
10.1109/ACCESS.2020.3004879
ISSN
2169-3536
Abstract
Translations of ancient-language texts can supply content for digital media and can inform fields such as the study of natural phenomena, medicine, and science. These needs have prompted a global movement to translate ancient languages, but the work requires expert translators. Such experts are difficult to train and, more importantly, manual translation is slow. Consequently, the recovery of ancient characters using machine translation has recently been investigated, but there is currently no literature on the machine translation of ancient Korean. This paper proposes the first ancient Korean neural machine translation model, built on a Transformer. The model can improve a translator's efficiency by quickly providing draft translations of untranslated ancient documents. Furthermore, a new subword tokenization method, Share Vocabulary and Entity Restriction Byte Pair Encoding, is proposed based on the characteristics of ancient Korean sentences. The proposed method improves on conventional subword tokenization methods such as byte pair encoding by 5.25 BLEU points. In addition, decoding strategies such as n-gram blocking and ensemble models improve performance by a further 2.89 BLEU points. The model has been made publicly available as a software application.
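For readers unfamiliar with the byte pair encoding baseline that the abstract compares against, the following is a minimal sketch of how conventional BPE learns its merge rules. It is not the paper's Share Vocabulary and Entity Restriction variant, whose details are not given in this record; the function name `learn_bpe` and the toy word list are illustrative assumptions.

```python
from collections import Counter


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules with plain byte pair encoding.

    `words` is a list of strings; the result is a list of symbol pairs in
    the order they were merged. Hypothetical helper, for illustration only.
    """
    # Each distinct word becomes a tuple of single-character symbols,
    # weighted by how often the word occurs.
    vocab = Counter(tuple(word) for word in words)
    merges = []

    for _ in range(num_merges):
        # Count every adjacent symbol pair across the weighted vocabulary.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for left, right in zip(symbols, symbols[1:]):
                pairs[(left, right)] += freq
        if not pairs:
            break  # every word is already a single symbol

        # The most frequent pair becomes the next merge rule.
        best = max(pairs, key=pairs.get)
        merges.append(best)

        # Rewrite every word, fusing each occurrence of the chosen pair.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged = []
            i = 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab

    return merges


if __name__ == "__main__":
    print(learn_bpe(["ab", "ab", "abc"], num_merges=5))
```

On the toy input, the pair `("a", "b")` occurs three times and is merged first; the only remaining pair is then `("ab", "c")`, after which no pairs are left and learning stops early.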
Appears in Collections
Graduate School > Department of Computer Science and Engineering > 1. Journal Articles