Countering the hatred: The counter-speech dataset in Korean for evaluating hate speech detection models (혐오와 대항: 혐오표현 탐지 모델 평가를 위한 대항표현 데이터셋 구축)
- Other Titles
- Countering the hatred: The counter-speech dataset in Korean for evaluating hate speech detection models
- Authors
- 박하율; 박현아; 송상헌
- Issue Date
- 2022
- Publisher
- 담화·인지언어학회 (The Discourse and Cognitive Linguistics Society of Korea)
- Keywords
- hate speech detection; counter-speech; language model; ethics in NLP
- Citation
- 담화와 인지 (Discourse and Cognition), v.29, no.2, pp.1-23
- Indexed
- KCI
- Journal Title
- 담화와 인지 (Discourse and Cognition)
- Volume
- 29
- Number
- 2
- Start Page
- 1
- End Page
- 23
- URI
- https://scholar.korea.ac.kr/handle/2021.sw.korea/142042
- ISSN
- 1226-5691
- Abstract
- This study argues that a Korean counter-speech dataset is necessary for ethical and effective hate speech detection research. Counter-speech is a response to online hate that aims to stop the spread of hate speech, and it is considered an alternative to deleting and blocking. Because counter-speech often employs offensive language or linguistic structures similar to those of hate speech, even state-of-the-art hate speech detection models usually classify it as hate speech. This false positive bias risks silencing the language of minorities and their allies. Nevertheless, Korean hate speech detection models have not been evaluated against counter-speech, because no Korean counter-speech dataset has existed. We therefore introduce the first Korean counter-speech dataset, annotated with target groups. We then tested a Korean hate speech detection model on our dataset, revealing a significant drop in the model’s accuracy from 97.9% to 42.7%.
- Appears in Collections
- College of Liberal Arts > Department of Linguistics > 1. Journal Articles
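The false positive bias described in the abstract can be made concrete with a small sketch. This is not the authors' evaluation code. The label encoding, the function names, and the eight toy predictions below are hypothetical, and the sketch assumes that every counter-speech item carries a gold label of "not hate". Under that assumption, accuracy on a counter-speech test set equals one minus the false positive rate, so a detector that flags counter-speech as hate loses accuracy in direct proportion.

```python
from typing import Sequence

# Hypothetical label encoding, chosen for this illustration only.
HATE, NOT_HATE = 1, 0


def accuracy(gold: Sequence[int], pred: Sequence[int]) -> float:
    """Fraction of items whose predicted label matches the gold label."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def false_positive_rate(gold: Sequence[int], pred: Sequence[int]) -> float:
    """Fraction of gold non-hate items that the model flags as hate."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must be of equal length")
    negatives = [p for g, p in zip(gold, pred) if g == NOT_HATE]
    if not negatives:
        raise ValueError("no gold non-hate items to evaluate")
    return sum(p == HATE for p in negatives) / len(negatives)


# Toy data: eight counter-speech items, all assumed to be gold non-hate.
gold = [NOT_HATE] * 8
# Made-up model output in which five of the eight items are flagged as hate.
pred = [HATE, HATE, NOT_HATE, HATE, NOT_HATE, HATE, HATE, NOT_HATE]

acc = accuracy(gold, pred)
fpr = false_positive_rate(gold, pred)
print(f"accuracy={acc:.3f}  false_positive_rate={fpr:.3f}")
```

Because the toy test set contains only non-hate items, every error is a false positive, which is why the two printed numbers sum to one. A test set that mixes hate speech and counter-speech would need the two rates reported separately.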