Text Embedding Augmentation Based on Retraining With Pseudo-Labeled Adversarial Embedding (open access)
- Authors
- Kim, M.; Kang, P.
- Issue Date
- 2022
- Publisher
- Institute of Electrical and Electronics Engineers Inc.
- Keywords
- Data models; Extrapolation; Interpolation; Semantics; Task analysis; Training; Transformers
- Citation
- IEEE Access, v.10, pp.8363 - 8376
- Indexed
- SCIE; SCOPUS
- Journal Title
- IEEE Access
- Volume
- 10
- Start Page
- 8363
- End Page
- 8376
- URI
- https://scholar.korea.ac.kr/handle/2021.sw.korea/143202
- DOI
- 10.1109/ACCESS.2022.3142843
- ISSN
- 2169-3536
- Abstract
- Pre-trained language models (LMs) achieve outstanding performance on various natural language processing tasks. However, because these models have a very large number of parameters in order to handle large-scale text corpora during pre-training, they risk overfitting when fine-tuned on small task-oriented datasets. In this paper, we propose a text embedding augmentation method to prevent such overfitting. The proposed method augments a text embedding by generating an adversarial embedding, which is not identical to the original input embedding but maintains its characteristics, using PGD-based adversarial training on the input text data. A pseudo-label identical to the label of the input text is then assigned to the adversarial embedding, and a separate LM is retrained using the adversarial embedding and pseudo-label as an input embedding and label pair. Experimental results on several text classification benchmark datasets demonstrate that the proposed method effectively prevents the overfitting that commonly occurs when adapting a large-scale pre-trained LM to a specific task. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
- Appears in Collections
- College of Engineering > School of Industrial and Management Engineering > 1. Journal Articles
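
The procedure summarized in the abstract (perturb an input embedding with PGD, keep the original label as a pseudo-label, retrain a separate model) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a toy logistic-regression classifier in place of a pre-trained LM, fixed 2-D vectors in place of text embeddings, an L2 perturbation ball, and an augmented set that keeps the original pairs alongside the adversarial ones. All function names and the hyperparameters `eps`, `alpha`, and `steps` are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def loss_grad_wrt_input(w, b, x, y):
    # Gradient of binary cross-entropy with respect to the input embedding x.
    p = sigmoid(dot(w, x) + b)
    return [(p - y) * wi for wi in w]

def project_l2(delta, eps):
    # Project the perturbation back onto the L2 ball of radius eps.
    norm = math.sqrt(dot(delta, delta))
    if norm <= eps:
        return delta
    return [d * eps / norm for d in delta]

def pgd_adversarial_embedding(w, b, x, y, eps=0.3, alpha=0.2, steps=3):
    # Gradient ascent on the loss, projecting after every step (PGD).
    delta = [0.0] * len(x)
    for _ in range(steps):
        x_adv = [xi + di for xi, di in zip(x, delta)]
        g = loss_grad_wrt_input(w, b, x_adv, y)
        g_norm = math.sqrt(dot(g, g)) or 1.0  # avoid division by zero
        delta = [di + alpha * gi / g_norm for di, gi in zip(delta, g)]
        delta = project_l2(delta, eps)
    return [xi + di for xi, di in zip(x, delta)]

def train_logreg(pairs, dim, lr=0.5, epochs=200):
    # Plain SGD on a logistic-regression stand-in for the LM (assumption).
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in pairs:
            err = sigmoid(dot(w, x) + b) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def augment_with_pseudo_labels(w, b, pairs, **pgd_kwargs):
    # Each adversarial embedding receives the label of its source input.
    augmented = list(pairs)  # keeping the originals is an assumption
    for x, y in pairs:
        x_adv = pgd_adversarial_embedding(w, b, x, y, **pgd_kwargs)
        augmented.append((x_adv, y))
    return augmented

# Toy "embeddings" with binary labels (illustrative data only).
data = [([1.0, 0.5], 1), ([0.8, 1.2], 1), ([-1.0, -0.7], 0), ([-0.6, -1.1], 0)]

w1, b1 = train_logreg(data, dim=2)                            # first model
augmented = augment_with_pseudo_labels(w1, b1, data, eps=0.3)
w2, b2 = train_logreg(augmented, dim=2)                       # separate model
```

In this sketch the first model only supplies gradients for the perturbation, and the second model is trained from scratch on the augmented pairs. With a real LM the perturbation would presumably be applied to the token embedding tensor and the gradient obtained by automatic differentiation rather than in closed form.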