트랜스포머기반의 멀티모달 영상자막 생성요약 (Multi-Modal Abstractive Summarization based Transformer using Video Transcripts)
- Other Titles
- Multi-Modal Abstractive Summarization based Transformer using Video Transcripts
- Authors
- 이민예; 한성원
- Issue Date
- 2021
- Publisher
- 대한산업공학회
- Keywords
- Abstractive Summarization; Multi-Modal; Transformer
- Citation
- 대한산업공학회지, v.47, no.5, pp.433 - 443
- Indexed
- KCI
- Journal Title
- 대한산업공학회지
- Volume
- 47
- Number
- 5
- Start Page
- 433
- End Page
- 443
- URI
- https://scholar.korea.ac.kr/handle/2021.sw.korea/138258
- ISSN
- 1225-0988
- Abstract
- In this paper, we propose MASTF (Multimodal Abstractive Summarization based on Transformer). Previous neural models for multimodal abstractive summarization relied on hierarchical attention built on recurrent neural networks. Although Transformers have shown excellent performance across natural language processing tasks, including abstractive summarization, they had not been applied to multimodal abstractive summarization. We therefore use a Transformer to improve the performance of multimodal abstractive summarization of video transcripts. The Transformer-based model outperforms the hierarchical attention-based model by 24.17% in ROUGE-L, and by 10.52% when speech and text are combined.
- Appears in Collections
- College of Engineering > School of Industrial and Management Engineering > 1. Journal Articles